RANDOM TRANSPOSON INSERTION IN STAPHYLOCOCCUS
AUREUS AND USE THEREOF TO IDENTIFY ESSENTIAL GENES



Field of Invention 

The present invention relates to a novel method of generating random transposon insertions in 
the genome of Staphylococcus aureus (S. aureus). The present invention further relates to the use of
random transposon mutants generated by such method to identify putative essential S. aureus genes. 
The invention further relates to the use of such genes in screening assays to identify, evaluate or design 
antibacterial agents useful for the treatment of Staphylococcus infections and for the production of 
Staphylococcus vaccines. Such antibacterial agents are useful for treating or preventing opportunistic 
infections in immunocompromised individuals and for treating and preventing hospital acquired 
Staphylococcus infections, septicemia, endocarditis, scarlet fever and toxic-shock syndrome associated
with Staphylococcus infection. Also disclosed is a Bayesian statistical model that may be used to
increase the statistical confidence that any given gene identified using the disclosed transposon insertion 
methodology is essential.

Background of Invention

S. aureus is a gram-positive bacterium grouped within Bacillus sp. on the basis of ribosomal 
RNA sequences. This nonmotile coccus grows under aerobic and anaerobic conditions, in which it forms
grape-like clusters. Its main habitats are the nasal membranes and skin of warm-blooded animals, in
whom it causes a range of infections from mild to severe, such as pneumonia, sepsis, osteomyelitis, and
infectious endocarditis. The organism produces many toxins and is highly effective at developing
resistance to antibiotics. In fact, S. aureus is one of the major causes of community-acquired and
hospital-acquired infections, and its toxins include super-antigens that cause unique disease entities such
as toxic shock syndrome and Staphylococcus-associated scarlet fever. In 1961 it was first reported that
this bacterium had developed resistance to methicillin, invalidating almost all antibiotics including the most
potent beta lactams.

In this regard, reports of bacterial strains becoming resistant to known antibiotics are becoming
more common, signaling that new antibiotics are needed to combat all bacterial infections, and
particularly to combat S. aureus, an organism responsible for many nosocomial infections. Unfortunately,
the identification of new antibiotics has historically been painstakingly laborious with no guarantee of
success. Traditional methods have involved blindly and randomly testing potential drug candidate
molecules, with the hope that one might be effective. Presently, the average cost to discover and
develop a new drug is nearly $500 million, and the average time for drug development is 15 years from
laboratory to patient. Clearly, new identification and screening methods that will shorten this process
and reduce its cost are needed.

A newly emerging regime for identifying new antibacterial agents is to first identify gene
sequences and proteins required for bacterial proliferation ("essential genes and essential proteins")
and then conduct a biochemical and structural analysis of the particular target gene or protein in order to
identify compounds that interact with the target. Such methodology combines molecular modeling
technology, combinatorial chemistry and the means to design candidate drugs, and affords a more
directed alternative to merely screening random compounds with the hope that one might be effective
for inhibiting or eradicating a particular bacterium.

Nevertheless, even this preferred approach presents obstacles including the identification of 
essential genes and proteins, and the design of new assays for the genes thus identified in order to 
efficiently screen candidate compounds. With respect to this approach, several groups have proposed
systems for the identification of essential genes. For instance, Zyskind and colleagues propose a method
of identifying essential genes in Escherichia coli by subcloning a library of E. coli nucleic acid
sequences into an inducible expression vector, introducing the vectors into a population of E. coli cells,
isolating those vectors that, upon activation and expression, negatively impact the growth of the E. coli
cell, and characterizing the nucleic acid sequences and open reading frames contained on the subclones
identified. See WO 00/44906, herein incorporated by reference. The disadvantage of this method is that
the overexpression of nonessential genes can also negatively impact the cell, particularly the
overexpression of membrane proteins and sugar transport proteins that are not necessary for growth
where alternative carbon sources exist. Such proteins typically become trapped in membrane export
systems when the cell is overloaded, and would be identified by this methodology. See Muller, FEMS
Microbiol. Lett. 1999 Jul 1;176(1):219-27.

Another group proposes the identification of growth conditional mutants, and more specifically 
temperature sensitive (ts) mutants, as a means to identify essential genes in Staphylococcus aureus. See 
Benton et al., U.S. Patent 6,037,123, issued March 14, 2000, herein incorporated by reference. Each 
gene is identified by isolating recombinant bacteria derived from growth conditional mutant strains, i.e.,
following introduction of a vector containing a library of nucleic acid sequences, which would grow
under non-permissive conditions but which were not revertants. These recombinant bacteria were found
to contain DNA inserts that encoded wild type gene products that replaced the function of the mutated 
gene under non-permissive growth conditions. By this method, Benton and colleagues were able to 
identify 38 loci on the S. aureus chromosome, each consisting of at least one essential gene. 

The disadvantages of this method are, first, that the chemical employed to induce mutagenesis
(diethyl sulfate, DES) is capable of causing several mutations in the same cell, thereby complicating
interpretation of the results. Second, the method is particularly labor intensive in that one must
painstakingly analyze replica plates of individual colonies grown at permissive and non-permissive
temperatures, where replica plates include both mutant and non-mutant cells. Thus, choosing a level
of mutagen that minimizes the number of non-mutant colonies one must screen in order to identify one
mutant, while at the same time avoiding multiple mutations in the same cell, may be an arduous task.

Another group has proposed a transposon mutagenesis system for identifying essential genes 
called "GAMBIT" ("genomic analysis and mapping by in vitro transposition"), and has used the system 
to identify essential genes first in the bacteria Haemophilus influenzae and Streptococcus
pneumoniae, and more recently in Pseudomonas aeruginosa. See Akerley et al., Systematic
identification of essential genes by in vitro mariner mutagenesis, Proc. Natl. Acad. Sci. USA 95(15):
8927-32; Wong and Mekalanos, 2000, Proc. Natl. Acad. Sci. USA 97(18): 10191-96; and Mekalanos et
al., U.S. Patent No. 6,207,384, issued March 27, 2001, herein incorporated by reference. GAMBIT
involves first isolating and purifying specific genomic segments of approximately 10 kilobases using
extended-length PCR, and creating a high density transposon insertion map of the isolated region using
Himar1 transposon mutagenesis. The transposon insertions are then transferred to the chromosome
following transformation of the bacteria with the transposon-containing vectors, and selection for the
antibiotic resistance marker on the transposon. The position of each transposon insertion with respect to
a given PCR primer is then determined by genetic footprinting, i.e., by amplifying sub-PCR products
using one of the original PCR primers and a primer that recognizes an internal site in the Himar1
transposon. By analyzing the length of PCR fragments thus identified, it is possible to identify regions
that are devoid of transposon insertions, thereby signaling regions that might contain essential genes. 
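
By way of illustration only, the footprinting logic described above can be sketched in a few lines of Python. The sketch assumes that each sub-PCR fragment length has already been converted to an insertion position measured in base pairs from the fixed primer; the function name, the region length and the minimum gap size are hypothetical choices made for the example and are not taken from the cited references.

```python
def insertion_free_gaps(insert_positions, region_length, min_gap):
    """Return (start, end) intervals of at least min_gap bp that contain no insertion.

    insert_positions: insertion coordinates (bp from the fixed PCR primer).
    region_length:    length of the amplified genomic segment in bp.
    min_gap:          smallest insertion-free stretch worth reporting (assumed cutoff).
    """
    # Treat both ends of the amplified segment as boundaries.
    points = [0] + sorted(set(insert_positions)) + [region_length]
    gaps = []
    for left, right in zip(points, points[1:]):
        if right - left >= min_gap:
            gaps.append((left, right))
    return gaps


# Hypothetical 10 kb segment with five mapped insertions.
example_gaps = insertion_free_gaps([500, 900, 4000, 4200, 9500], 10000, 1000)
```

In this hypothetical example the stretches between 900 and 4000 bp and between 4200 and 9500 bp are reported as candidate regions that might contain essential genes; a real analysis would also have to account for the resolution of the gel used to size the fragments.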

While the GAMBIT method is a good technique for looking at a small region of the genome for 
essential genes, it would be extremely labor intensive to use this method for analyzing the entire 
genome. Furthermore, GAMBIT is not readily applicable for use in organisms that are less
recombinogenic than H. influenzae. 

Another group at Abbott Laboratories has proposed a genome scanning method for identification 
of putative essential genes in H. influenzae, whereby random transposon insertions are mapped and
analyzed to identify open reading frames containing no insertion in order to identify putative essential
genes. Reich et al., 1999, Genome Scanning in Haemophilus influenzae for Identification of Essential
Genes, J. Bacteriol. 181(16): 4961-68. However, even though transposon insertions were isolated that 
spanned the whole genome, the authors employed a genomic footprinting technique similar to that used 
in GAMBIT to map insertions in a short contiguous region of the chromosome. The method further 
employs the methods of mutation exclusion and zero time analysis in order to monitor the fate of 
individual insertions after transformation in growing culture, which looks at individual insertions on a 
case-by-case basis. 

Wong and Mekalanos also proposed identifying essential genes in P. aeruginosa by starting with 
the knowledge of three essential genes in H. influenzae and using genetic footprint analysis to determine 
if the homologues of these genes are essential in P. aeruginosa. Of three homologues tested, only one 
was unable to accommodate a transposon insertion. See Wong and Mekalanos, supra. Such results 
underscore the fact that a gene that is shown to be essential in one species will not necessarily be 
essential in another, given that some gene products may fulfill different functional roles in different 
species. 

Because S. aureus is a major cause of life-threatening infection and is notoriously
resistant to antibiotics, various groups have reported approaches for identification of S. aureus
essential genes, as these genes are useful potential targets for antibacterial chemotherapy and for
producing therapeutic and prophylactic vaccines.

The availability of the genome sequence of S. aureus, and related bacteria, makes possible
studies attempting to identify genes that are essential for viability of the microorganism in vitro or for its
ability to cause infection. The products of both types of genes are potential targets in the effort to
produce effective antimicrobial agents. Related thereto, Kuroda et al. recently published in the Lancet
the whole genome sequence of two related S. aureus strains (N315 and Mu50) by shot-gun random
sequencing. N315 is a meticillin-resistant S. aureus strain isolated in 1982 and Mu50 is an MRSA strain
with vancomycin resistance isolated in 1997. In their paper Kuroda et al. reported the identification of
open reading frames by the use of GAMBLER and GLIMMER programs, and annotation of each by
BLAST homology search, motif analysis and protein localisation prediction.

Also, Ji et al. recently reported a method for the identification of essential Staphylococcus genes 
using conditional phenotypes generated by antisense RNA (Ji et al., Science, 293: 2266-2269
(September 21, 2001)). Using this method, Ji et al. reported the identification of more than 150 putative
essential Staphylococcus genes where antisense ablation was lethal or had growth inhibitory effects. Of
these genes, 40% are reportedly orthologs or homologs of known essential bacterial genes.

Further, Xia et al. recently reported a method reportedly useful for rapid identification of
essential genes of Staphylococcus aureus using a vector host-dependent for autonomous replication,
pSA3182. This approach is based on the insertion by a single crossover of a specific DNA sequence
both in the middle of a structural gene, with the inherent inactivation of the gene, and at its 3' end,
where the insertion does not affect the structural gene but might have a polar effect on downstream
genes (Xia et al., Plasmid 42:144-49 (1999)). Their approach includes comparison of the frequency of
the insertion at these two locations as a means for predicting the essential character of a particular
gene. Accordingly, in their strategy, for each studied gene, different fragments located either in the
middle of a coding sequence or at its 3' end are introduced into a vector host-dependent for autonomous
replication, pSA3182. Xia et al. report the use of their approach to test the essential character of four S.
aureus genes, nusG, divIB, dbpA and dbpB.

Jana et al. also recently reported a method for identifying genes that are essential in S.
aureus by fusing the gene of interest to an IPTG-controllable spac promoter, and provided a general
approach by constructing a plasmid in which the cat-Pspac cassette is flanked by cloning sites suitable
for inserting DNA fragments of interest (Jana et al., Plasmid 44:100-4 (2000)).

Still further, Zhang et al. report a method for identifying essential genes of S. aureus using a
chromosomally-integrated spac system in combination with a LacI-expressing plasmid, pFF40. This
combination reportedly provides an inducible, titratable and well-regulated system for testing the
requirements of specific gene products for cell viability and conditional lethal phenotypes in S. aureus
(Zhang et al., Gene 235: 297-305 (2000)).

Another method for the identification of bacterial essential genes is entitled Transposon 
Mediated Differential Hybridisation (TMDH), which is disclosed in WO 01/07651, herein incorporated 
by reference. This method entails (i) providing a library of transposon mutants of the target organism; 
(ii) isolating polynucleotide sequences from the library which flank inserted transposons; (iii) 
hybridising said polynucleotide sequences with a polynucleotide library from said organism; and (iv)
identifying a polynucleotide in the polynucleotide library to which said polynucleotide sequences do not 
hybridise in order to identify an essential gene of the organism. However, the problem with this 
methodology is that it has a high propensity to lead to false positives, and many essential genes will be 
missed. Furthermore, the method does not yield any detailed information regarding the loci disrupted by 
transposons, or whether they were hit more than once. 

Previous attempts to generate random transposon insertions in the S. aureus genome have
encountered numerous difficulties. For instance, previous transposon systems for S. aureus have created
insertions predominantly concentrated in genomic "hot spots". In addition, difficulties have been
encountered in obtaining viable S. aureus bacteria after electroporation procedures, making it difficult to
generate a statistically significant number of mutations for mapping and to differentiate between
essential and nonessential mutations.

Thus, there is a great need for more efficient methods to identify essential genes, particularly in
S. aureus, so that new antibacterial agents may be designed therefrom for use in treatment of S. aureus
infections.

Summary of Invention 

The present inventors have developed a novel and efficient method for generating random 
transposon insertions in the Staphylococcus genome, preferably in the genome of S. aureus. The
inventive method provides for random insertion into the entire bacterial Staphylococcus genome. 

The methods of the invention further provide a method for generating a random insertion into a
Staphylococcus genome comprising subjecting Staphylococcal cells to random mutagenesis and 
culturing the mutagenized cells in a recovery broth. Preferably, the recovery broth is B2 Broth. 

The recovery broth used in the invention preferably comprises B2 Broth. The B2 Broth used in
the invention comprises from 0.5% to 1.5% casein hydrolysate, preferably 1.0% casein hydrolysate,
from 2.0% to 3.0% yeast extract, preferably 2.5% yeast extract, from 2.0% to 3.0% NaCl, preferably
2.5% NaCl, and from 0.05% to 0.15% K2HPO4, preferably 0.1% K2HPO4. The B2 Broth used in the
invention is preferably buffered to about pH 7.0.
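
For convenience, the composition ranges above can be turned into weighed amounts for a given batch volume. The following Python sketch assumes that the stated percentages are weight per volume (1% = 1 g per 100 mL), which the text does not state explicitly; the dictionary and function names are hypothetical.

```python
# Preferred B2 Broth composition and claimed ranges, in percent.
# ASSUMPTION: percentages are weight/volume (g per 100 mL); the text does not say.
B2_PREFERRED = {"casein hydrolysate": 1.0, "yeast extract": 2.5, "NaCl": 2.5, "K2HPO4": 0.1}
B2_RANGES = {
    "casein hydrolysate": (0.5, 1.5),
    "yeast extract": (2.0, 3.0),
    "NaCl": (2.0, 3.0),
    "K2HPO4": (0.05, 0.15),
}


def grams_needed(volume_ml, percent_by_component=B2_PREFERRED):
    """Grams of each component for volume_ml of broth, assuming % w/v."""
    return {name: pct * volume_ml / 100.0 for name, pct in percent_by_component.items()}


def within_claimed_ranges(percent_by_component):
    """True if every component percentage lies inside the stated range."""
    return all(
        B2_RANGES[name][0] <= pct <= B2_RANGES[name][1]
        for name, pct in percent_by_component.items()
    )


half_litre = grams_needed(500)  # amounts for a 500 mL batch
```

Under the weight-per-volume assumption, a 500 mL batch at the preferred composition calls for about 5 g casein hydrolysate, 12.5 g yeast extract, 12.5 g NaCl and 0.5 g K2HPO4, with the pH then adjusted to about 7.0.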

Methods of subjecting cells to random mutagenesis are known in the art, and include, for 
instance, commercially available transposon mutagenesis products. 

More particularly, using this novel random transposon insertion method, the present inventors
have generated >7400 viable transposon mutants, and have determined through PCR and DNA
sequencing the genomic insertion site of a majority of these mutants. Since the insertion of transposon
DNA into a bacterial genome disrupts the function of the gene at a particular location, the generation of
a viable transposon mutant provides direct evidence that the disrupted gene contained in the particular
mutant is not essential to the bacterium's survival under the tested growth conditions. Accordingly, by
systematically repeating the subject random transposon insertion method, it is anticipated that all or
substantially all S. aureus non-essential genes can be identified, based on the successful generation of
viable transposon mutants which contain transposon DNA inserted into the particular non-essential
gene. Thus, putative essential genes are identified by elimination, i.e., putative essential genes are S.
aureus genes for which no transposon mutants are generated containing transposon DNA inserted therein. (As
discussed in greater detail infra, the probability that a putative essential gene identified according to the
invention is in fact essential also depends on the size of the particular gene, and can be further validated
by use of statistical methods).
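
The identification by elimination described above may be sketched as follows. This is a minimal Python illustration under stated assumptions: ORF coordinates are given as inclusive (start, end) pairs on a single linear coordinate system, insertion sites are single base-pair positions, and the ORF names and coordinates shown are hypothetical rather than taken from Table 1.

```python
def classify_orfs(orfs, insertion_sites):
    """Split ORFs into those hit by at least one insertion and those never hit.

    orfs:            dict mapping ORF name -> (start, end), inclusive coordinates.
    insertion_sites: iterable of mapped insertion positions from viable mutants.
    Returns (nonessential, putative_essential) as sorted lists of names.
    """
    hits = {name: 0 for name in orfs}
    for pos in insertion_sites:
        for name, (start, end) in orfs.items():
            if start <= pos <= end:
                hits[name] += 1
    # A viable mutant with an insertion in an ORF is evidence the ORF is dispensable.
    nonessential = sorted(name for name, count in hits.items() if count > 0)
    # ORFs with no insertion remain candidates only; they are not proven essential.
    putative_essential = sorted(name for name, count in hits.items() if count == 0)
    return nonessential, putative_essential


# Hypothetical ORFs and insertion sites, for illustration only.
demo_orfs = {"orfA": (100, 1000), "orfB": (1200, 1500), "orfC": (2000, 4000)}
demo_hit, demo_unhit = classify_orfs(demo_orfs, [150, 980, 2500, 4500])
```

In this hypothetical example orfA and orfC are classified as non-essential and orfB remains a putative essential gene. Because orfB is short, its lack of insertions is weak evidence on its own, which is why gene size enters the statistical validation.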

Moreover, the present inventors have developed a method that is useful for providing a database 
of potential essential or otherwise important S. aureus genes which may be used to verify essentiality
and to design antibacterial agents active against the identified targets.

Also, the invention encompasses the use of essential genes and proteins identified by the
inventive transposon mutagenesis protocols to produce therapeutic and prophylactic vaccines for
conferring therapeutic and prophylactic immunity against Staphylococcus infection. These vaccines will
comprise the bacterial antigen or fragment thereof identified by the invention, antibodies that
specifically bind the antigen, including polyclonal, monoclonal and nonclonal antibodies, or may comprise
nucleic acid sequence based vaccines that contain a DNA sequence that encodes the said antigen or
antigen fragment or antibody specific thereto.

Additionally, the invention allows for the identification of "motifs" of the essential genes
identified by the invention, i.e., regions of the gene which are similar or related to that of other bacterial
genes, and the use of these motifs as targets to screen compound libraries for compounds that inhibit or
inactivate a desired gene function.

Particularly, the inventors have generated >7400 transposon mutants and have determined the 
genomic insertion site of most of these mutants via PCR and DNA sequencing. Using the publicly 
available S. aureus genomic sequence, a map of transposon insertions is then generated, preferably using 
a library of at least about 3,000 to 6,000 transposon insertions, and more preferably using a library of at 
least about 4,000 to 5,000 transposon insertions. The generated map is used to provide a database of 
about 500 to 1500 open reading frames, or more particularly 1000 to 1400 open reading frames, for which no
transposon insertions are obtained, each of which represents a potential essential gene required for
growth and proliferation of S. aureus in the growth media and conditions disclosed infra in the
experimental protocols or an important gene, the mutation of which results in an attenuated growth 
mutant. 

Thus, one aspect of the invention is to provide a database of putative essential or important genes,
defined by the absence of transposon insertions in those genes in a High Throughput Transposon
Insertion Map (HTTIM) database comprising about 3000 to 8000 transposon insertions in the genome of
S. aureus. Minimally, such a database comprises approximately 1294 open reading frames (ORFs), each
of which may be further tested for essentiality using a variety of tests disclosed herein. However,
predictions of essentiality may be bolstered based on length of the ORF and predicted function and other
statistical factors, thereby providing for narrower databases of putative essential genes. Thus, the
invention also encompasses the production of databases that are narrower and comprise only those
genes for which essentiality may be predicted with at least an 80% confidence level, and include at least
about 600 to 625 genes. The invention also includes databases assigned a confidence level of about 85%
and including at least about 530 to 543 genes. The invention further includes databases assigned a
confidence level of about 90% including at least about 400 to 407 genes. Further, the invention includes
databases assigned a confidence level of about 95% and including at least about 240 to 246 genes.
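
To illustrate why ORF length affects the confidence that an unhit gene is essential, the following Python sketch applies Bayes' rule under deliberately simple assumptions. It is not the statistical model of the invention. It assumes that insertions land uniformly at random across the genome, that an essential ORF can never carry an insertion in a viable mutant, and that a prior fraction of essential genes is supplied by the user. The genome length, insertion count, prior and ORF lengths used in the example are hypothetical inputs.

```python
def posterior_essential(orf_length_bp, n_insertions, genome_length_bp, prior_essential):
    """P(essential | no insertion observed), under a uniform-insertion assumption."""
    if not 0.0 < prior_essential < 1.0:
        raise ValueError("prior_essential must be strictly between 0 and 1")
    if not 0 < orf_length_bp < genome_length_bp:
        raise ValueError("orf_length_bp must be positive and shorter than the genome")
    # Chance that a dispensable ORF is missed by all n independent insertions.
    p_miss_if_nonessential = (1.0 - orf_length_bp / genome_length_bp) ** n_insertions
    # An essential ORF is assumed to be missed with probability 1.
    return prior_essential / (
        prior_essential + (1.0 - prior_essential) * p_miss_if_nonessential
    )


def filter_by_confidence(unhit_orf_lengths, threshold, **model):
    """Names of unhit ORFs whose posterior meets the confidence threshold."""
    return sorted(
        name
        for name, length in unhit_orf_lengths.items()
        if posterior_essential(length, **model) >= threshold
    )


# Hypothetical inputs: ~2.8 Mb genome, 5000 mapped insertions, 20% prior.
MODEL = dict(n_insertions=5000, genome_length_bp=2_800_000, prior_essential=0.2)
confident = filter_by_confidence({"short_orf": 300, "long_orf": 3000}, 0.90, **MODEL)
```

Under these assumed inputs a 3000 bp ORF with no insertions passes a 90% threshold while a 300 bp ORF does not, which mirrors the way the databases narrow as the required confidence level rises. Real insertion data are not perfectly uniform, so figures from such a sketch should be treated as rough guides only.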

The transposon insertion map and database of putative essential open reading frames (ORFs) 
obtained may be used to confirm the essentiality of genes, for example by integration knock outs in the 
presence of chromosomal complementation or by integration and activation of a regulatable promoter. 
An "essential" gene is one that cannot be "knocked out," i.e. for which null mutants having complete 
absence of the gene product are not viable. This does not mean, however, that such genes could not 
tolerate point mutations or truncations that preserve sufficient gene product function so as to enable cell 
growth and survival. Essential genes are to be distinguished from "important" genes in that a "knock 
out" of an important gene does not lead to cell death but rather results in an attenuated growth mutant. 
Such genes may be included in the database of open reading frames not hit by random transposon
mutagenesis as described herein, because attenuated growth colonies may be significantly smaller than
the average S. aureus colony and may have been overlooked when transposon insertion mutants were
picked to generate the high throughput transposon insertion database (HTTIM). 

Nevertheless, important gene products may interact with or regulate other genes, gene products 
or cellular processes that are essential, thereby making such gene products appropriate targets for drug 
design. Moreover, most drugs do not effectively kill all the pathogenic bacteria in the body; rather, they
kill or growth attenuate a portion of the bacteria, empowering the immune system to target the
remainder. Hence, important genes that, when targeted with an antibacterial agent, result in attenuated
growth, are also targets for the antibacterial drugs of the present invention.

Such attenuated mutants grow more slowly than wild type, and may grow more slowly due to
reduced expression of an essential gene, i.e., the transposon is in a gene that regulates expression of an
essential gene, or due to expression of a truncated form of an essential gene, i.e., the transposon is in the
essential gene itself and leads to expression of a truncated mRNA. For example, mutants that show a
higher drug susceptibility could be the result of insertions in a gene that potentiates resistance, such as an
efflux pump, or due to reduced expression of essential genes involved in the mechanism of action of the
drug. Expression of mutated forms of essential and important genes may make the cell more susceptible
to compounds that inhibit that particular gene or gene product, and may allow the identification of
antibacterial agents with greater sensitivity. Furthermore, screening in whole cells overcomes the
potential problems of uptake and efflux that are sometimes an issue for compounds identified via
enzyme-based assays.

The essential and important genes of the invention may be used to design, screen for and
evaluate potential antibacterial agents for the purpose of developing new treatments for S. aureus
infection. Antibacterial agents identified according to the invention may have activity against the gene
or against the corresponding gene product or metabolic pathways requiring the gene product. For
instance, antibacterial agents according to the invention may include antisense nucleic acids or
regulatory proteins that bind to open reading frames, to upstream polar sequences or to promoters that
drive expression of the genes encoded by such open reading frames. Active agents according to the
invention may also include antibodies or proteins that bind to proteins encoded by open reading frames,
or to transcriptional or translational regulators of such genes or proteins, or to binding partners of such
proteins. Agents may also be chemical compounds designed following molecular modeling of essential
gene products according to the invention, or mutant proteins designed therefrom that compete with the
essential wild type protein for reactive cell components or for interacting nutrients, as well as agents
from random chemical libraries.

The present invention therefore includes methods and assays for identifying antibacterial agents 
having specificity for the essential or important open reading frames identified, or to genes and proteins 
that interact with such open reading frames or the products encoded thereby. Once essential and 
important open reading frames are identified, antibacterial agents may be identified using the assays and 
methods described herein, or by any suitable assay. Such assays may vary depending on the function 
delineated for each essential locus, as would be apparent to those of skill in the art. For instance, 
enzyme assays may be designed based on the predicted function of essential and important genes in 
order to define classes of inhibitors to be tested. Also, random chemical libraries may be screened for 
activity against the isolated genes or gene products. Cell lines may be designed or isolated that 
demonstrate reduced expression of essential genes, thereby providing a sensitive screening tool for 
inhibitors that affect the activity of that gene or gene product as it functions in the cell. Such cell lines
may be devised from cells having transposon insertions that lead to attenuated growth, or may be
constructed by the promoter swap techniques described herein, by using a regulatable promoter that can
be used to increase gene expression, allowing for confirmation of target specificity. Here, the minimal
inhibitory concentration of the inhibitor is directly related to the expression level of the target gene, such
that under low expression, an attenuated growth cell is more susceptible to an inhibitor than the wild
type strain, and as the expression level is raised, the minimum inhibitory concentration (MIC)
increases. The MIC shift will be consistent when the inhibitor acts on the regulated target.
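
The expected relationship between expression level and MIC can be expressed as a simple consistency check. The Python sketch below is illustrative only: it assumes that paired measurements of relative target expression (for example, inducer concentration) and MIC are available, and it treats a non-decreasing MIC with a net increase across the series as a "consistent" shift. The function name, the criterion and the example values are hypothetical.

```python
def mic_shift_is_consistent(measurements):
    """Check that MIC rises with target expression.

    measurements: list of (expression_level, mic) pairs in any order.
    Returns True when MIC never decreases as expression increases and the
    MIC at the highest expression exceeds the MIC at the lowest.
    """
    if len(measurements) < 2:
        raise ValueError("need at least two expression levels to assess a shift")
    ordered = sorted(measurements)          # sort by expression level
    mics = [mic for _, mic in ordered]
    non_decreasing = all(a <= b for a, b in zip(mics, mics[1:]))
    return non_decreasing and mics[-1] > mics[0]


# Hypothetical data: (relative expression, MIC in ug/mL).
on_target = mic_shift_is_consistent([(0.1, 0.5), (1.0, 2.0), (10.0, 8.0)])
off_target = mic_shift_is_consistent([(0.1, 4.0), (1.0, 4.0), (10.0, 4.0)])
```

In the hypothetical data the first compound shows a rising MIC as expression is raised, as expected for an inhibitor acting on the regulated target, whereas the second compound shows no shift.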

In addition, by targeting agents against more than one essential or important gene, the possibility
of developing resistant bacterial strains is reduced.

Active agents and compounds can be formulated into pharmaceutical compounds and
compositions, effective for treating and preventing Staphylococcus infections in accordance with the
methods of the invention. Such therapy will be particularly useful in the hospital setting for preventing
and treating nosocomial infections. Depending on the activity of the essential or important gene
targeted, such agents could also be useful in treating all types of Staphylococcus infections ranging from
bacteraemia and septicemia, urinary-tract infections, pneumonia and chronic lung infections, burn
infections, food poisoning and other gastrointestinal infections, Staphylococcus-associated scarlet fever,
cancer, AIDS, endocarditis, dermatitis, osteochondritis, ear and eye infections, bone and joint infections,
gastrointestinal infections and skin and soft tissue infections, including wound infections, pyoderma and
dermatitis. Further, the invention provides pharmaceutical compositions appropriate for use in methods
of treating bacterial infections described above.

In particular, the invention provides therapeutic and prophylactic vaccines for conferring 
therapeutic or prophylactic immunity against Staphylococcus infection, containing S. aureus antigens,
fragments, motifs, antibodies specific thereto, or nucleic acid sequences encoding the same, optionally in
association with other anti-bacterial active agents and carriers or adjuvants.

Also, the invention provides motifs of essential genes identified according to the invention which 
may be used to identify essential genes in other bacteria as targets to identify compounds for inhibiting 
or eradicating Staphylococcus. Further, motifs identified according to the invention may allow for 
inhibition of multiple essential genes.

Brief Description of the Drawings

Figure 1. Depiction of a single crossover recombination event resulting in integration of a plasmid into
the bacterial chromosome. Isolation of such recombinants indicates that the targeted gene is not
essential.

Figure 2. Single crossover and integration of a plasmid resulting in the replacement of a wild type
promoter with a regulatable promoter ("promoter swap" strategy).

Figures 3-5 respectively contain schematics of plasmids pMOD, pMOD (Erm-1) and pMOD (Cm).

Figures 6-8 respectively contain the sequences for pMOD, pMOD (Erm-1) and pMOD (Cm).

Detailed Description of the Invention

The essential open reading frames identified in the present invention are set forth in Table 1.
These open reading frames were originally part of a library of putative nucleic acid sequences generated
from S. aureus strain RN4220. The sequence of S. aureus COL, an S. aureus strain similar to RN4220, is available at
http://www.tigr.org/tigr-scripts/CMR2/GenomePage3.spl?database=gsa, which sequence is incorporated
herein by reference. The SA Numbers in Table 1 correspond to the TIGR number system. Nevertheless, it
is expected that the genes identified will also be essential or important in related S. aureus strains as
well as other Staphylococcus species, given the low sequence diversity that exists between S. aureus
strains of widely diverse environments and the pronounced structural and functional homology of gene
products. Thus, it is expected that agents identified as antibacterial based on their interaction with genes
or gene products of S. aureus will be broadly applicable as antibacterial agents against a variety of
Staphylococcus species as well as other bacteria including but not limited to Escherichia, Haemophilus,
Vibrio, Borrelia, Enterococcus, Helicobacter, Legionella, Mycobacterium, Mycoplasma, Neisseria,
Pseudomonas, Streptococcus, etc.

Thus, the present invention encompasses an isolated nucleic acid molecule comprising a nucleic 
acid sequence encoding a polypeptide having at least 80% sequence identity to a polypeptide encoded 
by a nucleic acid sequence selected from the group consisting of the Staphylococcus aureus open 
reading frames (ORFs) listed in Table 1. More preferably, the present invention encompasses an 
isolated nucleic acid molecule comprising a nucleic acid sequence encoding a polypeptide having at 
least about 85 to 90% sequence identity to a polypeptide encoded by a nucleic acid sequence selected 
from the group consisting of the Staphylococcus aureus open reading frames (ORFs) listed in Table 1. 
Even more preferably, the present invention encompasses an isolated nucleic acid molecule comprising 
a nucleic acid sequence encoding a polypeptide having at least about 90 to about 95% sequence identity 
to a polypeptide encoded by a nucleic acid sequence selected from the group consisting of the 
Staphylococcus aureus open reading frames (ORFs) listed in Table 1. 

In particular, the invention encompasses isolated nucleic acid molecules comprising nucleic acid 
sequences encoding polypeptides having at least 80% sequence identity, or more preferably at least 
about 85 to 90 to 95% identity, to a polypeptide encoded by an essential or important nucleic acid 
sequence selected from the group consisting of the Staphylococcus aureus open reading frames (ORFs) 
listed in Table 1, wherein essentiality or importance of said nucleic acid sequence is determined by 
integration knock-out coupled with extra-chromosomal complementation. Likewise, the invention 
encompasses isolated nucleic acid molecules comprising nucleic acid sequences encoding polypeptides 
having at least 80% sequence identity, or more preferably at least about 85 to 90 to 95% identity, to a 
polypeptide encoded by an essential nucleic acid sequence selected from the group consisting of the 
Staphylococcus aureus open reading frames (ORFs) listed in Table 1, wherein essentiality or importance 
of said nucleic acid sequence is determined by integration of a regulatable promoter into the gene, or via 
any other suitable method. 
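
The identity thresholds recited above (at least 80%, about 85 to 90%, about 90 to 95%) can be illustrated with a short calculation. The following is only a sketch: it assumes the two polypeptide sequences have already been aligned to equal length with '-' marking gaps, and it counts identical positions over aligned columns, which is one of several possible conventions and is not specified in the text. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings are already aligned to the same length and
    use '-' for gaps. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the tiers recited in the text."""
    if pct >= 90.0:
        return "at least about 90%"
    if pct >= 85.0:
        return "at least about 85%"
    if pct >= 80.0:
        return "at least 80%"
    return "below 80%"


if __name__ == "__main__":
    # Illustrative, made-up peptide strings differing at one of ten positions.
    pct = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(f"{pct:.1f}% identity: {identity_tier(pct)}")
```

In practice the alignment itself would come from a dedicated alignment program; the sketch covers only the final arithmetic.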

Given that the library of nucleic acid sequences encompassed in Table 1 provides an 
unprecedented tool useful for the identification of essential and otherwise important genes in 
Staphylococcus and the construction and isolation of attenuated mutants, the present invention includes 
a library of nucleic acid sequences consisting essentially of nucleic acid sequences having at least 70% 
sequence identity, or more preferably at least about 80 to 90 to 95% identity, to a nucleic acid sequence 
selected from the group consisting of the Staphylococcus aureus open reading frames (ORFs) listed in 
Table 1, wherein said library of nucleic acid sequences is employed to identify essential or otherwise 
important genes or to construct or isolate attenuated mutants in Staphylococcus. 

Also encompassed in the invention is a map of at least about 3,000 to 6,000 transposon insertions 
in the genome of Staphylococcus aureus (High-Throughput Transposon Insertion Database or HTTIM), 
wherein said map is useful for identifying genes that are essential or important for survival of said 
Staphylococcus aureus, i.e., by permitting the generation of a database of open reading frames that do 
not contain a transposon insertion. 
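
The idea of deriving a database of open reading frames that contain no transposon insertion can be sketched as follows. This is an illustrative example only: the ORF names, coordinates and insertion positions are made up, and the function name is hypothetical. An ORF without an insertion is merely a candidate essential gene, since insertions may be missing by chance.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple


def orfs_without_insertions(
    orfs: Dict[str, Tuple[int, int]],
    insertion_sites: List[int],
) -> List[str]:
    """Return names of ORFs that contain no mapped transposon insertion.

    orfs maps an ORF name to inclusive (start, end) genome coordinates.
    insertion_sites holds the genome coordinate of each mapped insertion.
    """
    sites = sorted(insertion_sites)
    untouched = []
    for name, (start, end) in orfs.items():
        # Index of the first insertion located at or after the ORF start.
        i = bisect_left(sites, start)
        hit = i < len(sites) and sites[i] <= end
        if not hit:
            untouched.append(name)
    return sorted(untouched)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    example_orfs = {"orfA": (100, 400), "orfB": (500, 900), "orfC": (1000, 1300)}
    example_sites = [150, 950, 1200]
    print(orfs_without_insertions(example_orfs, example_sites))
```

With the made-up data above, only orfB lacks an insertion and would enter the candidate list.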

Thus, the databases and libraries disclosed herein may be used to formulate useful subsets of 
these libraries and databases. Accordingly, the invention includes subsets of the databases and libraries 
disclosed. Moreover, such a group of mutants identified from the HTTIM database of transposon hits 
provides a useful subset database for comparing homologies with essential genes of other organisms, for 
computer modeling of potential antibacterial agents, etc. A particularly useful database subset is one 
containing essential genes from S. aureus that are also identified as essential in other Gram negative or 
Gram positive bacteria. Indeed, genes that have essential homologs in other bugs are likely to provide 
useful targets for broad spectrum antibacterial agents, i.e., agents that have broad spectrum activity as an 
antibacterial agent. 

Further, the databases and subset databases of the present invention may also be used as 
comparative tools with other like databases or database subsets to identify broad spectrum targets. For 
instance, particularly envisioned is an embodiment wherein the database of putative essential genes 
identified in S. aureus is cross-referenced with a similar database formed from Pseudomonas aeruginosa, 
wherein homologues present in both databases signal a potential target for a broad spectrum antibacterial 
agent. Cross-referencing between P. aeruginosa and S. aureus in particular will identify antibacterial 
targets for identifying broad spectrum antibiotics active against both Gram negative and Gram positive 
bacteria. However, databases derived from any bacteria could be employed in such comparisons, as well as 
databases formed from yeast, fungi, mycoplasma, and other potential pathogens. 
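
The cross-referencing described above amounts to intersecting two sets of putative essential genes through a table of homologues. A minimal sketch follows; the gene identifiers and the homologue table are placeholders invented for illustration, and the function name is hypothetical. How homologues are assigned (for example by a sequence similarity search) is outside the sketch.

```python
from typing import Dict, List, Set, Tuple


def shared_essential_targets(
    essential_a: Set[str],
    essential_b: Set[str],
    homologs_a_to_b: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Pair each essential gene of organism A with its homologue in
    organism B when that homologue is also listed as essential in B."""
    pairs = [
        (gene_a, homologs_a_to_b[gene_a])
        for gene_a in essential_a
        # .get() yields None for genes without a known homologue,
        # and None is never a member of essential_b.
        if homologs_a_to_b.get(gene_a) in essential_b
    ]
    return sorted(pairs)


if __name__ == "__main__":
    # Placeholder identifiers, not real locus names.
    sa_essential = {"sa1", "sa2", "sa3"}
    pa_essential = {"pa1", "pa9"}
    homolog_table = {"sa1": "pa1", "sa2": "pa2"}
    print(shared_essential_targets(sa_essential, pa_essential, homolog_table))
```

Each returned pair marks a gene that is putatively essential in both organisms and is therefore a candidate broad spectrum target in the sense used above.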

Also encompassed in the invention is the use of essential and important genes and the 
corresponding proteins expressed therefrom in the design of vaccines for eliciting prophylactic or 
therapeutic immune responses against S. aureus. 

Such vaccines will typically comprise a S. aureus protein antigen or fragment or derivative 
thereof encoded by an essential or important gene. Preferably, the protein antigen is expressed from a 
recombinant polynucleotide. Additionally, such antigens will preferably be a protein expressed on the 
surface of the bacteria. 

Where the invention is directed to a fragment of a protein encoded by an essential or important 
gene, said fragment is preferably at least 8 to 12 amino acids long, and even more preferably at least 
about 20 to 30 amino acids long. Preferably, the fragment comprises either a B cell or a T cell epitope. 

Where the invention is directed to a derivative of a protein encoded by an essential or important 
gene, said derivative may contain one or more amino acid substitutions, additions or deletions. 
Preferably, the amino acid substitutions are conservative amino acid replacements. Conservative amino 
acid replacements are those that take place within a family of amino acids that are related in their side 
chains. Genetically encoded amino acids are generally divided into four families: (1) acidic = aspartate, 
glutamate; (2) basic = lysine, arginine, histidine; (3) non-polar = alanine, valine, leucine, isoleucine, 
proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar = glycine, asparagine, 
glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan, and tyrosine are sometimes 
classified jointly as aromatic amino acids. For example, it is reasonably predictable that an isolated 
replacement of a leucine with an isoleucine or valine, an aspartate with glutamate, a threonine with a 
serine, or a similar conservative replacement of an amino acid with a structurally related amino acid will 
not have a major effect on the biological activity. Polypeptide molecules having substantially the same 
amino acid sequence as the protein but possessing minor amino acid substitutions that do not 
substantially affect the functional aspects are encompassed within the scope of derivatives of the proteins 
of the invention. 
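
The four side-chain families listed above can be encoded directly to test whether a given substitution is conservative in the sense used here. The following sketch uses standard one-letter amino acid codes; the function names are hypothetical, and the classification is only the four-family scheme of the text, not a substitution matrix.

```python
# Four families of genetically encoded amino acids, as listed in the text,
# written with standard one-letter codes (C is cysteine).
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "non-polar": set("AVLIPFMW"),
    "uncharged polar": set("GNQCSTY"),
}


def family_of(residue: str) -> str:
    """Return the family name of a one-letter amino acid code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"not a standard one-letter amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within the same side-chain family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    # The replacements named in the text, plus one non-conservative contrast.
    for pair in [("L", "I"), ("L", "V"), ("D", "E"), ("T", "S"), ("K", "D")]:
        print(pair, is_conservative(*pair))
```

The leucine, aspartate and threonine replacements named in the text all fall within a single family, whereas a lysine to aspartate change crosses from the basic to the acidic family.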

The polypeptide fragment or derivative is preferably immunologically identifiable with the 
polypeptide encoded by the essential or important gene. The polypeptide fragment or derivative is 
preferably immunogenic and is able to cause a humoral and/or cellular immune response, either alone or 
when linked to a carrier, in the presence or absence of an adjuvant. The polypeptide fragment or 
derivative may be fused to or incorporated into another polypeptide sequence. This other polypeptide 
sequence may include one or more other proteins, fragments or derivatives thereof encoded by an 
essential or important gene. The other polypeptide sequence may also include a polypeptide sequence 
which allows for presentation of the polypeptide fragment or derivative. 

Accordingly, the present invention encompasses an isolated polypeptide and fragments and 
derivatives thereof, wherein said polypeptide has at least 80% sequence identity to a polypeptide 
encoded by a nucleic acid sequence selected from the group consisting of the S. aureus open reading 
frames (ORFs) listed in Table 1. More preferably, the present invention encompasses an isolated 
polypeptide and fragments and derivatives thereof, wherein said polypeptide has at least about 85 to 
90% sequence identity to a polypeptide encoded by a nucleic acid sequence selected from the group 
consisting of the S. aureus open reading frames (ORFs) listed in Table 1. Even more preferably, the 
present invention encompasses an isolated polypeptide and fragments and derivatives thereof, wherein 
said polypeptide has at least about 90% to about 95% sequence identity to a polypeptide encoded by a 
nucleic acid sequence selected from the group consisting of the S. aureus open reading frames (ORFs) 
listed in Table 1. 

In particular, the invention encompasses isolated polypeptides and fragments and derivatives 
thereof, wherein said polypeptides have at least 80% sequence identity, or more preferably at least about 
85 to 90 to 95% identity, to a polypeptide encoded by an essential or important nucleic acid sequence 
selected from the group consisting of the S. aureus open reading frames (ORFs) listed in Table 1, 
wherein the essentiality or importance of said nucleic acid sequence is determined by integration 
knock-out coupled with extra-chromosomal complementation. Likewise, the invention encompasses isolated 
polypeptides and fragments and derivatives thereof, wherein said polypeptides have at least 80% 
sequence identity, or more preferably at least about 85 to 90 to 95% identity, to a polypeptide encoded 
by an essential nucleic acid sequence selected from the group consisting of the S. aureus open reading 
frames (ORFs) listed in Table 1, wherein essentiality or importance of said nucleic acid sequence is 
determined by integration of a regulatable promoter into the gene, or via any other suitable method. 

Also encompassed in the invention are therapeutic and prophylactic vaccines that comprise 
ligands that specifically bind antigens encoded by essential or important genes identified according to 
the invention, for use in, for instance, passive immunization. Preferred ligands are antibodies and 
antibody fragments that specifically bind the antigen encoded by the essential gene. Such antibodies 
may be polyclonal or monoclonal. Types of antibodies and antibody fragments include, by way of 
example, murine antibodies, chimeric antibodies, humanized antibodies, Fab fragments, Fab2 fragments, 
human antibodies and scFv's. Methods for producing antibodies and antibody fragments by 
recombinant and non-recombinant methods are well known to those skilled in the art. In some 
embodiments the antigen used in such passive immunization may be attached to a cytotoxic moiety, e.g., 
a radionuclide or other agent that is cytotoxic against the bacteria. 

Further encompassed within the scope of the invention are cells or viral vectors that express on 
their surface a S. aureus essential gene, fragment or variant identified according to the invention. 

In the case of prophylactic vaccines, the vaccine will comprise an immunogenic composition 
comprising a prophylactically effective amount of an antigen, antibody, cells or vector expressing an 
antigen encoded by an essential or important gene and will be formulated such that upon administration 
it elicits a protective immune response. In the case of therapeutic vaccines, the vaccine will comprise an 
immunogenic composition comprising a therapeutically effective amount of an antigen, antibody, cells 
or vectors expressing an antigen encoded by an essential or important gene and will be formulated such 
that upon administration it elicits a therapeutic immune response. Dosage effective amounts of 
prophylactic and therapeutic vaccines will be determined by known methods and will typically vary 
from about 0.00001 g/kg body weight to about 5-10 g/kg body weight. 

The immunogenic compositions of the invention can be administered by known methods, i.e., 
mucosally or parenterally. 

Suitable routes of mucosal administration include oral, intranasal (IN), intragastric, pulmonary, 
intestinal, rectal, ocular and vaginal routes. Preferably, mucosal administration is oral or intranasal. 

Where mucosal administration is used, the immunogenic composition is preferably adapted for 
mucosal administration. For instance, where the composition is administered orally, it may be in the 
form of tablets or capsules (optionally enteric-coated), liquid, transgenic plants, etc. Where the 
composition is administered intranasally, it may be in the form of a nasal spray, nasal drops, gel or 
powder. Where the antigen composition is adapted for mucosal administration, it may further be 
formulated such that the antigen remains stable, for instance by the use of carriers and excipients. 

The immunogenic compositions of the invention can further comprise a mucosal adjuvant. 
Mucosal adjuvants suitable for use in the invention include (A) E. coli heat-labile enterotoxin ("LT"), or 
detoxified mutants thereof, such as the K63 or R72 mutants; (B) cholera toxin ("CT"), or detoxified 
mutants thereof; (C) microparticles (i.e., a particle of ~100 nm to ~150 μm in diameter, more 
preferably ~200 nm to ~30 μm in diameter, and most preferably ~500 nm to ~10 μm in diameter) formed 
from materials that are biodegradable and non-toxic (e.g. a poly(α-hydroxy acid), a polyhydroxybutyric 
acid, a polyorthoester, a polyanhydride, a polycaprolactone etc.); (D) a polyoxyethylene ether or a 
polyoxyethylene ester (see International patent application WO 99/52549); (E) a polyoxyethylene 
sorbitan ester surfactant in combination with an octoxynol (see International patent application WO 
01/21207) or a polyoxyethylene alkyl ether or ester surfactant in combination with at least one 
additional non-ionic surfactant such as an octoxynol (see International patent application WO 
01/21152); (F) chitosan (e.g. International patent application WO 99/27960); and (G) an 
immunostimulatory oligonucleotide (e.g. a CpG oligonucleotide) and a saponin (see International patent 
application WO 00/62800). Other mucosal adjuvants are also available (e.g. see chapter 7 of Vaccine 
design: the subunit and adjuvant approach, eds. Powell & Newman, Plenum Press 1995 (ISBN 0-306- 
44867-X)). 

Mutants of LT are preferred mucosal adjuvants, in particular the "K63" and "R72" mutants (e.g. 
see International patent application WO 98/18928), as these result in an enhanced immune response. 

Microparticles are also preferred mucosal adjuvants. These are preferably derived from a 
poly(α-hydroxy acid), in particular, from a poly(lactide) ("PLA"), a copolymer of D,L-lactide and glycolide or 
glycolic acid, such as a poly(D,L-lactide-co-glycolide) ("PLG" or "PLGA"), or a copolymer of 
D,L-lactide and caprolactone. The microparticles may be derived from any of various polymeric starting 
materials which have a variety of molecular weights and, in the case of the copolymers such as PLG, a 
variety of lactide:glycolide ratios, the selection of which will be largely a matter of choice, depending in 
part on the coadministered antigen. 

Antigen may be entrapped within the microparticles, or may be adsorbed to them. Entrapment 
within PLG microparticles is preferred. PLG microparticles are discussed in further detail in Morris et 
al., (1994), Vaccine, 12:5-11, in chapter 13 of Mucosal Vaccines, eds. Kiyono et al., Academic Press 
1996 (ISBN 012410587), and in chapters 16 & 18 of Vaccine design: the subunit and adjuvant 
approach, eds. Powell & Newman, Plenum Press 1995 (ISBN 0-306-44867-X). 

LT mutants may advantageously be used in combination with microparticle-entrapped antigen, 
resulting in significantly enhanced immune responses. 

Suitable routes of parenteral administration include intramuscular (IM), subcutaneous, 
intravenous, intraperitoneal, intradermal, transcutaneous, and transdermal (see, e.g., International patent 
application WO 98/20734) routes, as well as delivery to the interstitial space of a tissue. 

The immunogenic compositions of the invention may be adapted for parenteral administration 
(e.g., in the form of an injectable, which will typically be sterile and pyrogen-free). 

The immunogenic composition may further comprise a parenteral adjuvant. Parenteral adjuvants 
suitable for use in the invention include: (A) aluminum compounds (e.g. aluminum hydroxide, 
aluminum phosphate, aluminum hydroxyphosphate, oxyhydroxide, orthophosphate, sulfate etc. (e.g. see 
chapters 8 & 9 of Vaccine design: the subunit and adjuvant approach, eds. Powell & Newman, Plenum 
Press 1995 (ISBN 0-306-44867-X) (hereinafter "Vaccine design"))), or mixtures of different aluminum 
compounds, with the compounds taking any suitable form (e.g. gel, crystalline, amorphous etc.), and 
with adsorption being preferred; (B) MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, 
formulated into submicron particles using a microfluidizer) (see Chapter 10 of Vaccine design; see also 
International patent application WO 90/14837); (C) liposomes (see Chapters 13 and 14 of Vaccine 
design); (D) ISCOMs (see Chapter 23 of Vaccine design); (E) SAF, containing 10% Squalane, 0.4% 
Tween 80, 5% pluronic-block polymer L121, and thr-MDP, either microfluidized into a submicron 
emulsion or vortexed to generate a larger particle size emulsion (see Chapter 12 of Vaccine design); (F) 
Ribi™ adjuvant system (RAS), (Ribi Immunochem) containing 2% Squalene, 0.2% Tween 80, and one 
or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), 
trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (G) 
saponin adjuvants, such as QuilA or QS21 (see Chapter 22 of Vaccine design), also known as 
Stimulon™; (H) ISCOMs, which may be devoid of additional detergent (International patent 
application WO 00/07621); (I) complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant 
(IFA); (J) cytokines, such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons 
(e.g. interferon-γ), macrophage colony stimulating factor, tumor necrosis factor, etc. (see Chapters 27 & 
28 of Vaccine design); (K) microparticles (see above); (L) monophosphoryl lipid A (MPL) or 
3-O-deacylated MPL (3dMPL) (e.g. chapter 21 of Vaccine design); (M) combinations of 3dMPL with, for 
example, QS21 and/or oil-in-water emulsions (European patent applications 0835318, 0735898 and 
0761231); (N) oligonucleotides comprising CpG motifs (see Krieg (2000) Vaccine, 19:618-622; 
Krieg (2001) Curr. Opin. Mol. Ther., 2001, 3:15-24; WO 96/02555, WO 98/16247, WO 98/18810, 
WO 98/40100, WO 98/55495, WO 98/37919 and WO 98/52581, etc.) i.e. containing at least one CG 
dinucleotide, with 5-methylcytosine optionally being used in place of cytosine; (O) a polyoxyethylene 
ether or a polyoxyethylene ester (International patent application WO 99/52549); (P) a polyoxyethylene 
sorbitan ester surfactant in combination with an octoxynol (International patent application WO 
01/21207) or a polyoxyethylene alkyl ether or ester surfactant in combination with at least one 
additional non-ionic surfactant such as an octoxynol (International patent application WO 01/21152); 
(Q) an immunostimulatory oligonucleotide (e.g. a CpG oligonucleotide) and a saponin (International 
patent application WO 00/62800); (R) an immunostimulant and a particle of metal salt (International 
patent application WO 00/23105); (S) a saponin and an oil-in-water emulsion (International patent 
application WO 99/11241); (T) a saponin (e.g. QS21) + 3dMPL + IL-12 (optionally + a sterol) 
(International patent application WO 98/57659); and (U) other substances that act as immunostimulating 
agents to enhance the effectiveness of the composition (e.g. see Chapter 7 of Vaccine design). 
Aluminium compounds and MF59 are preferred adjuvants for parenteral use. 

The immunogenic compositions of the invention may be administered in a single dose, or as part 
of an administration regime. The regime may include priming and boosting doses, which may be 
administered mucosally, parenterally, or various combinations thereof. 

In some instances the vaccines of the invention may comprise several antigens, fragments or 
variants encoded by essential genes identified according to the invention. Alternatively, the vaccine 
may further comprise antigens identified by other methods, or specific to other bacteria, e.g., in order to 
provide multivalent vaccines. 

With respect to libraries according to the invention, a library of polynucleotides or a library of 
transposon insertion sites is a collection of sequence information, which information is provided in 
either biochemical form (e.g., as a collection of polynucleotide molecules), or in electronic form (e.g., as 
a collection of polynucleotide sequences stored in a computer-readable form, as in a computer system 
and/or as part of a computer program). The sequence information of the polynucleotides can be used in 
a variety of ways, for instance as a resource for gene discovery, i.e., for identifying and verifying 
essential and important genes in Staphylococcus aureus, or for identifying essential or important 
homologues in other genera or species. A polynucleotide sequence in a library can be a polynucleotide 
that represents an mRNA, polypeptide, or other gene product encoded by the polynucleotide, and 
accordingly such a polynucleotide library could be used to formulate corresponding RNA or amino acid 
libraries according to the sequences of the library members. 

The nucleotide sequence information of the library can be embodied in any suitable form, e.g., 
electronic or biochemical forms. For example, a library of sequence information embodied in electronic 
form comprises an accessible computer data file (or, in biochemical form, a collection of nucleic acid 
molecules) that contains the representative nucleotide sequences of essential and important genes and/or 
insertion mutants that are differentially expressed (e.g., attenuated growth mutants). Other combinations 
and comparisons of cells affected by various diseases or stages of disease will be readily apparent to the 
ordinarily skilled artisan. Biochemical embodiments of the library include a collection of nucleic acids 
that have the sequences of the genes or transposon insertion sites in the library, where the nucleic acids 
can correspond to the entire gene in the library or to a fragment thereof, as described in greater detail 
below. 

The polynucleotide libraries of the subject invention generally comprise sequence information of 
a plurality of polynucleotide sequences, where at least one of the polynucleotides has a sequence of any 
of the sequences in Table 1. By plurality is meant at least 2, usually at least 3 and can include up to all 
of the sequences included in these tables. The length and number of polynucleotides in the library will 
vary with the nature of the library, e.g., if the library is an oligonucleotide array, a cDNA array, a 
computer database of the sequence information, etc. 

Where the library is an electronic library, the nucleic acid sequence information can be present in 
a variety of media. "Media" refers to a manufacture, other than an isolated nucleic acid molecule, that 
contains the sequence information of the present invention. Such a manufacture provides the genome 
sequence or a subset thereof in a form that can be examined by means not directly applicable to the 
sequence as it exists in a nucleic acid. For example, the nucleotide sequence of the present invention, 
e.g. the nucleic acid sequences of any of the polynucleotides identified in Table 1, can be recorded on 
computer readable media, e.g., any medium that can be read and accessed directly by a computer. Such 
media include, but are not limited to: magnetic storage media, such as a floppy disc, a hard disc storage 
medium, and a magnetic tape; optical storage media such as CD-ROM; electrical storage media such as 
RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. One of skill in 
the art can readily appreciate how any of the presently known computer readable mediums can be used 
to create a manufacture comprising a recording of the present sequence information. "Recorded" refers 
to a process for storing information on computer readable medium, using any such methods as known in 
the art. Any convenient data storage structure can be chosen, based on the means used to access the 
stored information. A variety of data processor programs and formats can be used for storage, e.g. word 
processing text file, database format, etc. In addition to the sequence information, electronic versions of 
the libraries of the invention can be provided in conjunction or connection with other computer-readable 
information and/or other types of computer-readable files (e.g., searchable files, executable files, etc., 
including, but not limited to, for example, search program software, etc.). 
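
As one illustration of recording sequence information in computer-readable form, the sketch below serializes named sequences to FASTA-style text and reads them back. The choice of FASTA is an assumption made for the example (the text leaves the storage format open), and the record names and function names are hypothetical.

```python
from typing import Dict


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Serialize name -> sequence records as FASTA-style text.

    Each record becomes a '>' header line followed by the sequence
    broken into lines of at most `width` characters.
    """
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def from_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-style text back into name -> sequence records."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    library = {"orf_example_1": "ATG" * 30, "orf_example_2": "ATGAAATAA"}
    text = to_fasta(library, width=20)
    print(text)
    print(from_fasta(text) == library)
```

The resulting text can be written to any of the media listed above; a database table keyed by ORF name would serve equally well.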

By providing the nucleotide sequence in computer readable form, the information can be 
accessed for a variety of purposes. Computer software to access sequence information is publicly 
available. For example, the gapped BLAST (Altschul et al. Nucleic Acids Res. (1997) 25:3389-3402) 
and BLAZE (Brutlag et al. Comp. Chem. (1993) 17:203) search algorithms on a Sybase system can be 
used to identify open reading frames (ORFs) within the genome that contain homology to ORFs from 
other organisms. 

As used herein, "a computer-based system" refers to the hardware means, software means, and 
data storage means used to analyze the nucleotide sequence information of the present invention. The 
minimum hardware of the computer-based systems of the present invention comprises a central 
processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily 
appreciate that any one of the currently available computer-based systems is suitable for use in the 
present invention. The data storage means can comprise any manufacture comprising a recording of the 
present sequence information as described above, or a memory access means that can access such a 
manufacture. 

"Search means" refers to one or more programs implemented on the computer-based system, to 
compare a target sequence or target structural motif, or expression levels of a polynucleotide in a 
sample, with the stored sequence information. Search means can be used to identify fragments or 
regions of the genome that match a particular target sequence or target motif. A variety of 
algorithms are publicly known and commercially available, e.g. MacPattern (EMBL), BLASTN and 
BLASTX (NCBI). A "target sequence" can be any polynucleotide or amino acid sequence of six or 
more contiguous nucleotides or two or more amino acids, preferably from about 10 to 100 amino acids 
or from about 30 to 300 nucleotides. A variety of comparing means can be used to accomplish 
comparison of sequence information from a sample (e.g., to analyze target sequences, target motifs, or 
relative expression levels) with the data storage means. A skilled artisan can readily recognize that any 
one of the publicly available homology search programs can be used as the search means for the 
computer based systems of the present invention to accomplish comparison of target sequences and 
motifs. Computer programs to analyze expression levels in a sample and in controls are also known in 
the art. 
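
A very small "search means" can be sketched as an exact-match scan of stored nucleotide sequences for a target sequence on either strand. This is not how BLASTN or the other programs named above work (they tolerate mismatches and gaps); it is a naive illustration only. The six-nucleotide minimum follows the definition of a target sequence given above, while the function names and example sequences are hypothetical.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written in A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target(
    stored: Dict[str, str],
    target: str,
    min_length: int = 6,
) -> List[Tuple[str, int, str]]:
    """Return (sequence name, 0-based position, strand) for every exact
    occurrence of the target in the stored sequences."""
    target = target.upper()
    if len(target) < min_length:
        raise ValueError(f"target must have at least {min_length} nucleotides")
    queries = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid reporting palindromic targets twice
        queries.append(("-", rc))
    hits = []
    for name, seq in stored.items():
        seq = seq.upper()
        for strand, query in queries:
            pos = seq.find(query)
            while pos != -1:
                hits.append((name, pos, strand))
                pos = seq.find(query, pos + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    database = {"s1": "AAATGCCGTAAA", "s2": "TTTACGGCATTT"}
    print(find_target(database, "ATGCCG"))
```

A position reported on the "-" strand is the location, on the stored sequence, of the reverse complement of the target.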

A "target structural motif," or "target motif," refers to any rationally selected sequence or 
combination of sequences in which the sequence(s) are chosen based on a three-dimensional 
configuration that is formed upon the folding of the target motif, or on consensus sequences of 
regulatory or active sites. There are a variety of target motifs known in the art. Protein target motifs 
include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs 
include, but are not limited to, hairpin structures, promoter sequences and other expression elements 
such as binding sites for transcription factors. 

A variety of structural formats for the input and output means can be used to input and output the 
information in the computer-based systems of the present invention. One format for an output means 
ranks the relative expression levels of different polynucleotides. Such presentation provides a skilled 
artisan with a ranking of relative expression levels to determine a gene expression profile. 

As discussed above, the "library" as used herein also encompasses biochemical libraries of the 
polynucleotides of Table 1, e.g., collections of nucleic acids representing the provided polynucleotides. 
The biochemical libraries can take a variety of forms, e.g., a solution of cDNAs, a pattern of probe 
nucleic acids stably associated with a surface of a solid support (i.e., an array) and the like. Of 
particular interest are nucleic acid arrays in which one or more of the sequences identified in Table 1 is 
represented on the array. By "array" is meant an article of manufacture that has at least a substrate with 
at least two distinct nucleic acid targets on one of its surfaces, where the number of distinct nucleic acids 
can be considerably higher, typically being at least 10 nt, usually at least 20 nt and often at least 25 nt. 
A variety of different array formats have been developed and are known to those of skill in the art. The 
arrays of the subject invention find use in a variety of applications, including gene expression analysis, 
drug screening, mutation analysis and the like, as disclosed in the above-listed exemplary patent 
documents. 

In addition to the above nucleic acid libraries, analogous libraries of polypeptides are also 
provided, where the polypeptides of the library will represent at least a portion of the polypeptides 
encoded by a gene corresponding to one or more of the sequences identified in Table 1. 



"Identity" as it is used in the present invention should be distinguished from "homology" or 
"homologous." In the context of the coding sequences and genes of this invention, "homologous" refers 
to genes whose expression results in expression products which have a combination of amino acid 
sequence similarity (or base sequence similarity for transcript products) and functional equivalence, and 
are therefore homologous genes. In general such genes also have a high level of DNA sequence 
similarity (i.e., greater than 80% identity when such sequences are identified among members of the 
same genus, but lower when these similarities are noted across bacterial genera), but are not identical. 
Relationships across bacterial genera between homologous genes are more easily identified at the 
polypeptide (i.e., the gene product) rather than the DNA level. The combination of functional 
equivalence and sequence similarity means that if one gene is useful, e.g., as a target for an antibacterial 
agent, or for screening for such agents, then the homologous gene is probably also useful, but may not 
react in the same manner or to the same degree to the activity of a specific antibacterial agent. 

Nevertheless, the identification of one such gene serves to identify a homologous gene through 
the same relationships as indicated above, and can serve as a starting point to determine whether the 
homologous gene is also essential, whether it responds to the same antibacterial agents, etc. Typically,
such homologous genes are found in other bacterial species, especially, but not restricted to, closely 
related species. Due to the DNA sequence similarity, homologous genes are often identified by 
hybridizing with probes from the initially identified gene under hybridizing conditions that allow stable 
binding under appropriately stringent conditions. For instance, nucleic acids having sequence similarity 
are detected by hybridization under low stringency conditions, for example, at 50°C and 10X SSC (0.9
M saline/0.09 M sodium citrate) and remain bound when subjected to washing at 55°C in 1X SSC.
Sequence identity can be determined by hybridization under stringent conditions, for example, at 50°C
or higher and 0.1X SSC (9 mM saline/0.9 mM sodium citrate). Hybridization methods and conditions
are well known in the art, see, e.g., USPN 5,707,829. Nucleic acids that are substantially identical to the
provided polynucleotide sequences, e.g. allelic variants, genetically altered versions of the gene, etc., 
bind to the provided polynucleotide sequences under stringent hybridization conditions. By using 
probes, particularly labeled probes of DNA sequences, one can isolate homologous or related or 
substantially identical genes. The equivalent function of the product is then verified using appropriate 
biological and/or biochemical assays. 

Using such hybridization techniques for the identification of homologous genes, it will be
possible to screen other species of bacteria, particularly other genera of gram positive pathogenic 
bacteria although gram negative bacteria may also be screened, to determine if any essential or 
important gene identified herein has a homologue in that particular genus of bacteria. If so, such gene 
could be cloned and isolated for essentiality in the particular genus, and further tested for sensitivity or 
susceptibility to the antibacterial agents and inhibitors identified herein. Specific genera of bacteria 
particularly appropriate for hybridization screening for the presence of homologues of essential and 
important genes include Escherichia, Hemophilus, Vibrio, Borrelia, Enterococcus, Helicobacter,
Legionella, Mycobacterium, Mycoplasma, Neisseria, Pseudomonas, Streptococcus, etc.

"Identity," on the other hand, is gauged from the starting point of complete homology.
Thereafter, identity may be described in terms of percentages according to the number of base changes 
in the DNA sequence taking into account any gaps. For purposes of the present invention, variants of the 
invention have a sequence identity greater than at least about 65%, preferably at least about 75%, more 
preferably at least about 85%, and can be greater than at least about 90% or more as determined by the 
Smith-Waterman homology search algorithm as implemented in MPSRCH program (Oxford 
Molecular). A preferred method of calculating percent identity is the Smith-Waterman algorithm, using 
the following. Global DNA sequence identity must be greater than 65% as determined by the Smith-
Waterman homology search algorithm as implemented in MPSRCH program (Oxford Molecular) using
an affine gap search with the following search parameters: gap open penalty, 12; and gap extension 
penalty, 1. 
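
As a rough illustration of the scoring scheme named above, the following Python sketch performs a Smith-Waterman local alignment with affine gaps and reports the percent identity of the best local alignment. It is not the MPSRCH program. Only the gap open penalty (12) and gap extension penalty (1) come from the text; the match and mismatch scores (+5/-4) and the convention that a gap of length k costs open + (k - 1) x extend are assumptions made for this example.

```python
def _step(cell, delta, identical):
    """Extend an alignment by one column, tracking score, matches and length."""
    return (cell[0] + delta, cell[1] + identical, cell[2] + 1)


def smith_waterman_identity(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Return (score, percent_identity) of the best local alignment of a and b.

    Each cell holds (score, identical_columns, total_columns), so the identity
    of the best-scoring local alignment is available without a traceback.
    """
    neg = (float("-inf"), 0, 0)
    zero = (0, 0, 0)
    n, m = len(a), len(b)
    H = [[zero] * (m + 1) for _ in range(n + 1)]  # best alignment ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]   # ends with a gap opposite b[j-1]
    F = [[neg] * (m + 1) for _ in range(n + 1)]   # ends with a gap opposite a[i-1]
    best = zero
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(_step(H[i][j - 1], -gap_open, 0),
                          _step(E[i][j - 1], -gap_extend, 0))
            F[i][j] = max(_step(H[i - 1][j], -gap_open, 0),
                          _step(F[i - 1][j], -gap_extend, 0))
            same = a[i - 1] == b[j - 1]
            diag = _step(H[i - 1][j - 1], match if same else mismatch, int(same))
            H[i][j] = max(zero, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    score, identical, columns = best
    return score, (100.0 * identical / columns if columns else 0.0)


if __name__ == "__main__":
    score, identity = smith_waterman_identity("ACGTACGTAC", "ACGTTCGTAC")
    print(f"score={score} identity={identity:.1f}%")
    print("exceeds 65% threshold:", identity > 65.0)
```

Carrying the match and column counts inside each cell keeps the sketch short; a production tool would normally perform an explicit traceback and use a full substitution matrix.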

Amino acid sequence variants are also included in the invention. Preferably, naturally or non- 
naturally occurring protein variants have amino acid sequences which are at least 85%, 90%, or 95% 
identical to the amino acid sequences identified herein, or to a shorter portion of these sequences. More 
preferably, the molecules are 98% or 99% identical. Percent sequence identity is determined using the 
Smith-Waterman homology search algorithm using an affine gap search with a gap open penalty of 12 
and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman homology search
algorithm is taught in Smith and Waterman, Adv. Appl. Math. (1981) 2:482-489.

Also included in the invention are fragments of the nucleic acid sequences and amino acid 
sequences identified herein, as well as RNAs and RNA fragments corresponding to the DNA sequences 
disclosed. Such nucleic acid fragments are at least about 10 nucleotides, more preferably at least about 
20 to 25 nucleotides, and more preferably at least about 50 to 100 nucleotides, and can include any 
fragment or variant of a fragment. Such nucleic acid fragments may be used as probes for identifying 
similar or substantially identical or identical nucleic acid sequences in other genera, or as tools in 
constructing nucleic acid vectors for knock out and promoter swap experiments. Such amino acid 
fragments are at least about four amino acids in length, more preferably at least about 8 to 12 amino 
acids in length, and more preferably at least about 20 to 30 amino acids in length, and may be used as 
agonists or antagonists to test binding interactions of the proteins disclosed herein, or alternatively as 
immunogens to isolate antibodies that recognize and bind to specific epitopes of a target protein. 

Once a gene is identified as being essential or important for Staphylococcus growth on rich 
media or in any specific environment, the invention also encompasses the identification of antibacterial
agents that have specific activity against the essential or important genes or their gene products or the
biochemical pathways in which they are involved. In this context, the term "biochemical pathway" 
refers to a connected series of biochemical reactions normally occurring in a cell, or more broadly a 
cellular event such as cellular division or DNA replication. Typically, the steps in such a biochemical 
pathway act in a coordinated fashion to produce a specific product or products or to produce some other, 
particular biochemical action. Such a biochemical pathway requires the expression product of a gene if 
the absence of that expression product either directly or indirectly prevents the completion of one or 
more steps in that pathway, thereby preventing or significantly reducing the production of one or more 
normal products or effects of that pathway. 

Thus, an agent specifically inhibits such a biochemical pathway requiring the expression product 
of a particular gene if the presence of the agent stops or substantially reduces the completion of the 
series of steps in that pathway. Such an agent may, but does not necessarily, act directly on the
expression product of that particular gene. An "expression product" of a gene means that, in a bacterial
cell of interest, the gene is transcribed to form RNA molecules. For those genes that are transcribed into
mRNAs, the mRNA is translated to form polypeptides. More generally, in this context, "expressed"
means that a gene product is formed at the biological level that would normally have the relevant
biological activity (i.e., RNA or polypeptide level).

Thus, the invention includes a method of screening for an antibacterial agent, comprising 
determining whether a test compound is active against an essential or important bacterial gene identified 
by the methods herein. The invention also includes a method of screening for an antibacterial agent, 
comprising determining whether a test compound is active against a protein encoded by an essential
bacterial gene identified herein, or active to inhibit the biochemical pathway that involves said protein.
The term "antibacterial agent" refers to both naturally occurring antibiotics produced by microorganisms
to suppress the growth of other microorganisms, and agents synthesized or modified in the laboratory
which have either bactericidal or bacteriostatic activity. An "active" agent in this context will inhibit the
growth of S. aureus and possibly related species. The term "inhibiting the growth" indicates that the rate
of increase in the numbers of a population of a particular bacterium is reduced. Thus, the term includes
situations in which the bacterial population increases but at a reduced rate, as well as situations where
the growth of the population is stopped, as well as situations where the numbers of the bacteria in the
population are reduced or the population even eliminated. If an enzyme activity assay is used to screen
for inhibitors, one can make modifications in uptake/efflux, solubility, half life, etc. to compounds in
order to correlate enzyme inhibition with growth inhibition.
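
The three situations covered by "inhibiting the growth" can be illustrated with a small Python sketch. The function name, the use of a net growth rate (e.g., per hour) as the measured quantity, and the example values are hypothetical and are not part of the disclosure.

```python
def classify_growth_inhibition(control_rate, treated_rate):
    """Classify an agent's effect from two net growth rates.

    control_rate: net rate of increase of the population without the agent.
    treated_rate: net rate of increase with the agent present (may be negative).
    """
    if control_rate <= 0:
        raise ValueError("the control culture must be growing")
    if treated_rate >= control_rate:
        return "not inhibited"
    if treated_rate > 0:
        return "inhibited: population increases at a reduced rate"
    if treated_rate == 0:
        return "inhibited: growth stopped"
    return "inhibited: population reduced"


if __name__ == "__main__":
    for rate in (0.9, 0.4, 0.0, -0.3):
        print(rate, "->", classify_growth_inhibition(0.9, rate))
```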

Assays may include any suitable method and may be expected to vary depending on the type of essential
gene or protein involved. For instance, one embodiment is a method comprising the steps of: 

a) contacting said protein or a biologically active fragment thereof with a test compound; and 

b) determining whether said test compound binds to said essential gene product or protein or fragment of 
said protein; 

wherein binding of said test compound to said polypeptide or said fragment is indicative that said test 
compound is an antibacterial agent. It is quite common, in identifying antibacterial agents, to assay for
binding of a compound to a particular polypeptide where binding is an indication of a compound which 
is active to modulate the activity of the polypeptide. Binding may be determined by any means 
according to the agent tested and techniques known in the art. 

Agents that inhibit binding of two proteins or polypeptides may also be identified, for
instance using a yeast two-hybrid system. Such a system will entail cloning the genes encoding each 
protein and expressing each in a reporter cell system such that interaction between the two proteins is 
monitored by observing the expression of a reporter gene. For instance, cDNAs cloned in a yeast two-
hybrid expression system (Chien et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 9578; Zervos et al.
(1993) Cell 72: 233) can be used to identify other cDNAs encoding proteins that interact with the
protein encoded by the first, thereby producing expression of the GAL4-dependent reporter gene.
Thereafter, cells expressing both proteins leading to expression of the reporter gene are used to screen 
for agents that interact with either protein, or the gene encoding either protein. Such systems are well 
known in the art and are well within the realm of ordinary skill. 

Another embodiment is a method for evaluating a test agent for inhibition of expression of an 
essential gene identified according to the methods herein, comprising: 

a) contacting a cell expressing said essential gene with said agent; and 

b) determining the amount or level of expression of said essential gene in said sample. 

The exact determination method will be expected to vary depending on the characteristics of the
expression product as would be readily apparent to one of ordinary skill in the art. Such methods can
include, for example, antibody binding methods, enzymatic activity determinations, and substrate analog
binding assays. Such level of expression could be monitored by monitoring the level of the product of
the essential gene in the cell, e.g., by SDS-PAGE, or by colorimetric assays using, for example, a lacZ
gene or protein fusion and detection on media using X-Gal or spectrophotometric detection.

When such fusions are employed, fusions may be designed using the chromosomal gene so long 
as the fusion does not disrupt the function of the essential gene, i.e., as with a gene fusion where lacZ is 
inserted just downstream of the essential gene and is expressed from the same promoter as the essential 
gene. Alternatively, one could employ an extrachromosomal fusion construct whereby the wild type 
chromosomal copy of the gene is not disrupted. In this case, one could employ a protein fusion, i.e., 
where a portion of lacZ sufficient to be detected with a colorimetric test is fused in frame with the 
coding region of the essential gene such that a fusion protein is obtained. Other detectable or
measurable proteins commonly used in the art may be used as an alternative to lacZ, for instance, phoA,
Lux/luciferase, etc.

Another method of the invention for evaluating a potential antibacterial agent comprises the
steps of: 

a) providing a bacterial strain comprising a mutant or normal form of the essential or important 
gene, wherein said mutant form of the gene confers a growth conditional phenotype; 

b) contacting bacteria of said bacterial strain with a test compound in semi-permissive or 
permissive growth conditions; and 

c) determining whether the growth of said bacterial strain comprising said mutant form of a gene is 
reduced in the presence of said test compound to a greater extent than a comparison bacteria comprising 
a normal form of said gene. 
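
The comparison called for in step c) can be sketched numerically in Python as follows. The growth measure (for example, optical density readings), the function names, the optional margin, and the example numbers are hypothetical and are offered only to make the comparison concrete.

```python
def fractional_growth_reduction(untreated, treated):
    """Fraction by which growth drops when the test compound is present."""
    if untreated <= 0:
        raise ValueError("untreated growth must be positive")
    return (untreated - treated) / untreated


def mutant_more_reduced(mutant_untreated, mutant_treated,
                        normal_untreated, normal_treated, margin=0.0):
    """True if the mutant strain is reduced more than the comparison strain.

    margin is an assumed tolerance: the mutant's fractional reduction must
    exceed that of the normal strain by more than this amount.
    """
    mutant_drop = fractional_growth_reduction(mutant_untreated, mutant_treated)
    normal_drop = fractional_growth_reduction(normal_untreated, normal_treated)
    return (mutant_drop - normal_drop) > margin


if __name__ == "__main__":
    # Hypothetical readings under semi-permissive conditions.
    print(mutant_more_reduced(0.8, 0.2, 1.0, 0.9, margin=0.1))
    print(mutant_more_reduced(0.8, 0.4, 1.0, 0.5, margin=0.1))
```

Expressing each strain's response as a fraction of its own untreated growth is a design choice that keeps the slower baseline growth of a conditional mutant from being mistaken for compound sensitivity.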

In this context, a "mutant form" of a gene is a gene which has been altered, either naturally or 
artificially, changing the base sequence of the gene, which results in a change in the amino acid 
sequence of an encoded polypeptide. The change in the base sequence may be of several different types, 
including changes of one or more bases for different bases, small deletions, and small insertions. 
Mutations may also include transposon insertions that lead to attenuated activity, i.e., by resulting in
expression of a truncated protein. By contrast, a normal form of a gene is a form commonly found in a 
natural population of a bacterial strain. Commonly a single form of a gene will predominate in natural 
populations. In general, such a gene is suitable as a normal form of a gene; however, other forms which
provide similar functional characteristics may also be used as a normal gene. In particular, a normal 
form of a gene does not confer a growth conditional phenotype on the bacterial strain having that gene, 
while a mutant form of a gene suitable for use in these methods does provide such a growth conditional 
phenotype.

As used in the present disclosure, the term "growth conditional phenotype" indicates that a
bacterial strain having such a phenotype exhibits a significantly greater difference in growth rates in 
response to a change in one or more of the culture parameters than an otherwise similar strain not having 
a growth conditional phenotype. Typically, a growth conditional phenotype is described with respect to 
a single growth culture parameter, such as temperature. Thus, a temperature-sensitive (or heat-sensitive) mutant
(i.e., a bacterial strain having a heat-sensitive phenotype) exhibits significantly reduced growth, and 
preferably no growth, under non-permissive temperature conditions as compared to growth under 
permissive conditions. In addition, such mutants preferably also show intermediate growth rates at 
intermediate, or semi-permissive, temperatures. Similar responses also result from the appropriate 
growth changes for other types of growth conditional phenotypes. A growth conditional phenotype can 
also be conferred by cloning an essential or important gene behind a regulatable promoter, for instance, 
a promoter that is only active, or only leads to transcription, under particular environmental conditions 
or in response to a specific environmental stimulus. Such growth conditional promoter mutants may be 
isolated according to the promoter swap strategies described herein. 

"Semi-permissive conditions" are conditions in which the relevant culture parameter for a 
particular growth conditional phenotype is intermediate between permissive conditions and non- 
permissive conditions. Consequently, in semi-permissive conditions the bacteria having a growth 
conditional phenotype will exhibit growth rates intermediate between those shown in permissive
conditions and non-permissive conditions. In general, such intermediate growth rate is due to a mutant
cellular component which is partially functional under semi-permissive conditions, essentially fully
functional under permissive conditions, and is non-functional or has very low function under non-
permissive conditions, where the level of function of that component is related to the growth rate of the
bacteria.

The term "method of screening" means that the method is suitable, and is typically used, for
testing for a particular property or effect in a large number of compounds. Therefore, the method 
requires only a small amount of time for each compound tested; typically more than one compound may 
be tested simultaneously (as in a 96-well microtiter plate, or in a series of replica plates), and preferably 
significant portions of the procedure can be automated. "Method of screening" also refers to determining 
a set of different properties or effects of one compound simultaneously. 

Because the essential and important genes identified herein can be readily isolated and the genes 
cloned into a variety of vectors known in the art, the invention also encompasses vectors comprising the 
nucleic acid sequences, open reading frames and genes of the invention, as well as host cells containing
such vectors. Because the essential genes identified herein can be readily isolated and the encoded gene 
products expressed by routine methods, the invention also provides the polypeptides encoded by those 
genes, as well as genes having at least about 50%, or more preferably about 60%, or more preferably 
about 70%, or more preferably about 80%, or more preferably about 90%, or most preferably about 95%
protein sequence identity.

Thus, by identifying certain essential and/or important genes, this invention provides a method of 
screening for an antibacterial agent by contacting a polypeptide encoded by one of the identified 
essential or important genes, or a biologically active fragment of such a polypeptide, with a test
compound, and determining whether the test compound binds to the polypeptide or polypeptide
fragment. In addition to simple binding determinations, the invention provides a method for identifying
or evaluating an agent active on one of the identified essential genes. The method involves contacting a
sample containing an expression product of one of the identified genes with the known or potential
agent, and determining the amount or level of activity of the expression product in the sample.

In particular, antibodies to essential and important gene products are anticipated to be suitable
diagnostic binding and antibacterial agents. Thus, antibodies to the proteins encoded by the essential 
and important genes identified by the methods described herein are also included in the invention. Such 
antibodies may be isolated according to well known techniques in the art, i.e., Kohler and Milstein for 
monoclonal antibodies. Also included are polyclonal antibodies and antibody fragments such as Fv, Fab 
and Fab2 fragments, as well as chimeric and humanized antibodies, and human antibodies, i.e., made 
using a Xeno mouse. 

In a further aspect, this invention provides a method of diagnosing the presence of a bacterial 
strain having one of the genes identified above, by probing with an oligonucleotide at least 15 
nucleotides in length, which specifically hybridizes to a nucleotide sequence which is the same as or 
complementary to the sequence of one of the bacterial genes identified above. In some cases, it is 
practical to detect the presence of a particular bacterial strain by direct hybridization of a labeled 
oligonucleotide to the particular gene. In other cases, it is preferable to first amplify the gene or a portion 
of the gene before hybridizing labeled oligonucleotides to those amplified copies. 

In a related aspect, this invention provides a method of diagnosing the presence of a bacterial 
strain by specifically detecting the presence of the transcriptional or translational product of the gene. 
Typically, a transcriptional (RNA) product is detected by hybridizing a labeled RNA or DNA probe to 
the transcript. Detection of a specific translational (protein) product can be performed by a variety of
different tests depending on the specific protein product. Examples would be binding of the product by 
specific labeled antibodies and, in some cases, detection of a specific reaction involving the protein 
product. Diagnostic assays find particular use in assaying tissue and fluid samples of patients suspected of
having a Staphylococcus infection.

Antibacterial agents identified according to the methods of the invention may be employed in
pharmaceutical compositions. Such compositions may be administered to patients in order to treat an
infection by or involving S. aureus, either alone or in combination with secondary agents targeted at, for
instance, virulence factors of S. aureus, or other bacteria that may be present in addition to S. aureus. In
this context, the term "administration" or "administering" refers to a method of giving a dosage of an
antibacterial pharmaceutical composition to a mammal, where the method is, e.g., topical, oral, 
intranasal, inhaled, intravenous, transdermal, intraperitoneal, or intramuscular. The preferred method of 
administration can vary depending on various factors, e.g., the components of the pharmaceutical 
composition, the site of the potential or actual bacterial infection, the bacterium involved, and the 
severity of an actual bacterial infection. 

As used above and throughout this application, "hybridize" has its usual meaning from molecular 
biology. It refers to the formation of a base-paired interaction between nucleotide polymers. The 
presence of base pairing implies that at least an appreciable fraction of the nucleotides in each of two 
nucleotide sequences are complementary to the other according to the usual base pairing rules. The 
exact fraction of the nucleotides which must be complementary in order to obtain stable hybridization 
will vary with a number of factors, including nucleotide sequence, salt concentration of the solution,
temperature, and pH.

The term, "DNA molecule", should be understood to refer to a linear polymer of 
deoxyribonucleotides, as well as to the linear polymer, base-paired with its complementary strand, 
forming double-strand DNA (dsDNA). The term is used as equivalent to "DNA chain" or "a DNA" or 
"DNA polymer" or "DNA sequence", so this description of the term meaning applies to those terms also. 
The term does not necessarily imply that the specified "DNA molecule" is a discrete entity with no 
bonding with other entities. The specified DNA molecule may have H-bonding interactions with other
DNA molecules, as well as a variety of interactions with other molecules, including RNA molecules. In
addition, the specified DNA molecule may be covalently linked in a longer DNA chain at one, or both 
ends. Any such DNA molecule can be identified in a variety of ways, including, by its particular 
nucleotide sequence, by its ability to base pair under stringent conditions with another DNA or RNA 
molecule having a specified sequence, or by a method of isolation which includes hybridization under 
stringent conditions with another DNA or RNA molecule having a specified sequence. 

References to a "portion" of a DNA or RNA chain mean a linear chain which has a nucleotide 
sequence which is the same as a sequential subset of the sequence of the chain to which the portion 
refers. Such a subset may contain all of the sequence of the primary chain or may contain only a shorter 
sequence. The subset will contain at least 15 bases in a single strand. However, by "same" is meant 
"substantially the same"; deletions, additions, or substitutions of specific nucleotides of the sequence, or 
a combination of these changes, which affect a small percentage of the full sequence will still leave the 
sequences substantially the same. Preferably this percentage of change will be less than 20%, more 
preferably less than 10%, and even more preferably less than 3%. "Same" is therefore distinguished 
from "identical"; for identical sequences there cannot be any difference in nucleotide sequences. 
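
The percentage thresholds in this definition can be made concrete with the following Python sketch. It is a simplification: it compares two sequences of equal length position by position, so it counts substitutions only, whereas deletions and additions would first require an alignment. The function names and tier labels are hypothetical.

```python
def fraction_changed(reference, candidate):
    """Fraction of positions that differ between two equal-length sequences."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(reference.upper(), candidate.upper()) if x != y)
    return differences / len(reference)


def sameness_tier(reference, candidate):
    """Map the fraction of changed positions onto the tiers stated in the text."""
    changed = fraction_changed(reference, candidate)
    if changed == 0:
        return "identical"
    if changed < 0.03:
        return "same (less than 3% change)"
    if changed < 0.10:
        return "same (less than 10% change)"
    if changed < 0.20:
        return "same (less than 20% change)"
    return "not substantially the same"


if __name__ == "__main__":
    reference = "A" * 100
    for k in (0, 2, 5, 15, 20):
        candidate = "C" * k + "A" * (100 - k)
        print(k, "changes ->", sameness_tier(reference, candidate))
```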

As used in reference to nucleotide sequences, "complementary" has its usual meaning from 
molecular biology. Two nucleotide sequences or strands are complementary if they have sequences that 
would allow base pairing between the strands according to the usual pairing rules. This does not require 
that the strands would necessarily base pair at every nucleotide; two sequences can still be
complementary with a low level of base mismatch such as that created by deletion, addition, or 
substitution of one or a few (up to 5 in a linear chain of 25 bases) nucleotides, or a combination of such 
changes. 
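
A minimal Python sketch of this tolerance follows. It pairs two equal-length DNA strands in antiparallel orientation and counts mismatches. The text gives only the figure of up to 5 mismatches in a chain of 25 bases; scaling that allowance in proportion to length, and the function names used, are assumptions of this example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(strand, other):
    """Count unpaired positions when `other` is paired antiparallel to `strand`."""
    if len(strand) != len(other):
        raise ValueError("this sketch only handles strands of equal length")
    partner = reverse_complement(other).upper()
    return sum(1 for x, y in zip(strand.upper(), partner) if x != y)


def is_complementary(strand, other, max_mismatches_per_25=5):
    """Apply the 5-in-25 allowance, scaled linearly with length (an assumption)."""
    allowed = max_mismatches_per_25 * len(strand) / 25
    return count_mismatches(strand, other) <= allowed


if __name__ == "__main__":
    strand = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical 25-mer
    perfect = reverse_complement(strand)
    print(count_mismatches(strand, perfect), is_complementary(strand, perfect))
```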



Other embodiments of the invention will be immediately envisaged by those of skill in the art 
upon reading the methods and examples to follow. Such examples are merely illustrative of the 
invention, and should not be construed as limiting the scope of the invention in any way. 
A. Methodology 

The following methods are used for generating transposon libraries in S. aureus. It should be 
emphasized that these methods are exemplary of methods which may be used to identify and map S. 
aureus essential genes and to construct a database of S. aureus essential genes according to the
invention. In particular, it should be understood that modification of these particular methods and
protocols is within the scope of the invention and within the purview of the ordinary skilled artisan. 

1. Method for Obtaining Electrocompetent S. aureus

An overnight culture of S. aureus was diluted 1 to 25 in B2 broth, pH 7.0 [1] and shaken at 37°C
until the culture reached mid log phase, an OD600 of 0.6-0.8. The cells were then chilled on ice and washed
with 500mM sucrose as described by Iandolo et al. [2]. However, the centrifuge condition of the
procedure is modified to 10,000g for 20 minutes. The final cell pellet is resuspended in a cold sucrose
solution and immediately frozen at -80°C as 35ul aliquots.

2. Transposon Construction

TN5 transposons are prepared using EZ::TN™pMOD™<MCS> Transposon Construction Vector 
and EZ::TN™ Transposase (Epicentre Technologies, Madison, WI). Initially two separate transposomes
are designed using either chloramphenicol or erythromycin markers. Although both are successful in
producing transposon mutants, the majority of the library is the result of the erythromycin transposon as
it produces more mutants per electroporation. The chloramphenicol marker is amplified from plasmid
pC194 and cloned into the pMOD™<MCS>. Amplifications from pC194 are performed using the
primers Cm194-HindF (5'-TATATaagcttGTTACAGTAATATTGACTTT-3') and Cm194-KpnR
(5'-TAACGggtaccGTTAGTGACATTAGAAAACC-3'). The erythromycin marker is amplified from
plasmid pTLV-1 using the primers Erm917-HindF (5'-AAATaagcttTAGAAGCAAACTTAAGAGTG-
3') and Erm917-KpnR (5'-CGGTCGTTATggtaccATTCAAATTTATCC-3'). Each primer contains a
restriction enzyme site, designated in lower case above, for cloning. The antibiotic markers are
amplified from their respective plasmids under the following conditions: 94°C for 1 minute followed by
30 cycles of 94°C for 1 min 30 sec, 60°C for 45 sec and 72°C for 1 min with a final extension time of 5
min. The markers are then cloned into the MCS of plasmid pMOD™<MCS> Transposon Construction
Vector. The transposon is then removed from the pMOD backbone by digestion with PvuII and run on
an agarose gel. The DNA is purified from the agarose using QIAquick Gel Extraction Kit (Qiagen Inc.,
Valencia, CA). 100 ng per microliter is generally obtained. Transposomes are made by mixing 500ng
of the purified transposon DNA with 5 ul of sterile water (or 10mM Tris, pH8), 5 Units of EZ::TN™
Transposase (Epicentre Technologies, Madison, WI) and 5 ul of 100% glycerol. The transposome
reaction is mixed and incubated at room temperature for 30 minutes. 2 microliters of the transposome
mixture is electroporated per aliquot of electrocompetent cells.
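
The lower-case cloning sites in the four primers above can be checked with a short Python sketch. The primer sequences are copied from the text; the mapping of AAGCTT to HindIII and GGTACC to KpnI reflects the standard recognition sequences of those enzymes, which the primer names suggest but the text does not state explicitly. The helper name is hypothetical.

```python
import re

# Standard recognition sequences for the enzymes implied by the primer names.
SITES = {"AAGCTT": "HindIII", "GGTACC": "KpnI"}

PRIMERS = {
    "Cm194-HindF": "TATATaagcttGTTACAGTAATATTGACTTT",
    "Cm194-KpnR": "TAACGggtaccGTTAGTGACATTAGAAAACC",
    "Erm917-HindF": "AAATaagcttTAGAAGCAAACTTAAGAGTG",
    "Erm917-KpnR": "CGGTCGTTATggtaccATTCAAATTTATCC",
}


def cloning_site(primer):
    """Return (enzyme, site, offset) for the lower-case run within a primer."""
    hit = re.search(r"[a-z]+", primer)
    if hit is None:
        raise ValueError("no lower-case site found in primer")
    site = hit.group().upper()
    return SITES.get(site, "unknown"), site, hit.start()


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        enzyme, site, offset = cloning_site(sequence)
        print(f"{name}: {enzyme} site {site} at offset {offset}")
```

Each site is preceded by a few extra 5' bases in the primer, which is why the reported offsets are greater than zero.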

3. Electrotransformation of S. aureus

Prior to electroporation, the competent cell aliquots are thawed on ice. Once completely
thawed, the cells are mixed with 2ul of transposon and the volume is adjusted to 70ul with cold 500mM
sucrose. The cell mixture is then aliquoted into a pre-chilled 0.1cm gap electroporation cuvette. The
mixture was then electroporated as described by Laddaga et al. [1] using a Gene Pulser™ and pulse
controller (Bio-Rad Laboratories Inc., Hercules, CA) (2.5kV, 25µF capacitance, 100 ohm resistance,
time constant 2.0-2.4). The cells are then immediately resuspended in 1.0 milliliter of B2 broth (10mM
CaCl2 and 10mM MgCl2), incubated on ice for 5 minutes and transferred to a round bottom test tube
and incubated with agitation at 37°C for 1 to 2 hours, depending upon the transposon marker. To induce
expression of the erythromycin transposon marker, halfway through the 37°C incubation, erythromycin
is added at 10ng/ml. The cells were then plated on NYE agar pH 7.0 [1] containing erythromycin
(1ug/ml) and lincomycin (5ug/ml) and incubated at 37°C for 48 hours.
4. DNA Extraction 

Colonies are picked from the NYE antibiotic plates directly into a 96 deep well block containing 
0.5 milliliters per well B2 broth (plus appropriate antibiotics). The blocks are allowed to incubate at 
37°C for 24 hours with agitation. After 24 hours, 0.1 milliliter is transferred to a 0.2ml thin walled PCR
plate using a multichannel pipette. Frozen stocks are also made from the deep well blocks and stored at
-80°C containing 10% (vol/vol) glycerol. The cells in the PCR plates are pelleted by centrifugation at
2,000rpm for 5 minutes. The supernatant is then removed and 150 microliters of a lysis cocktail is
added to each well using a multichannel pipette and the plate is sealed with a sterile cap mat. The lysis
cocktail consists of 1.0 mg/ml Lysozyme (Sigma), 10 ug/ml Lysostaphin (Recombinant, AMBI Inc.)
and InstaGene Matrix (Bio-Rad). Once the lysis cocktail is added, the 96-well plates are incubated at
37°C for 30 minutes in a thermocycler with the lid heat turned off. During the incubation, the
cocktail/cell mixture is mixed once by end over end shaking. Following the 37°C incubation, the plates
are centrifuged at 2,000 rpm briefly to remove any liquid that may be on the cap mat surface. The plates
are then incubated at 98°C in a thermocycler with the lid temperature on for 10 minutes. Following the
98°C incubation, the plates are cooled to 4°C, mixed and then centrifuged at 3,000 rpm for 10-20
minutes. 5ul of the resulting supernatant are used as template for PCR reactions.
5. DNA 

The technique used to characterize the DNA sequence of the transposon mutants consists of two
PCR reactions and was previously described by O'Toole and Kolter [3]. For the first round of
amplification, 5 µl of the InstaGene lysis supernatant is used as the template. In the first round of
amplification, the primer unique to the transposon, TNErm-1R (5'-CTGTTTCAAAACAGTAGATG-3') for the
erythromycin transposon or TNCm-1R2 (5'-GATAGGCCTAATGACTGGC-3') for the
chloramphenicol transposon, is used with arbitrary primer arb-8
(5'-GGCCACGCGTCGACTAGTACNNNNGATAT-3'). The first amplification conditions are 1 minute
at 94°C, followed by 6 cycles (30 seconds at 94°C, 30 seconds at 30°C, 2 minutes at 72°C) and 30
cycles (30 seconds at 94°C, 45 seconds at 45°C, 2 minutes at 72°C). The first PCR products are used for
the second amplification. The primers used in the second amplification are TNErm-2R
(5'-CAACATGACGAATCCCTCCTTC-3') or TNCm-2R2 (5'-GTCGGTTTTCTAATGTCACTAACG-3')
for the erythromycin or chloramphenicol transposons respectively, plus an arbitrary primer arb-tail
(5'-GGCCACGCGTCGACTAGTAC-3'). For the second PCR, 5 µl from the first amplification round
is used as template. The amplification conditions for the second PCR are 1 minute at 94°C followed
by 30 cycles (30 seconds at 94°C, 45 seconds at 50°C and 1 minute at 72°C). The PCR product from
the second amplification is purified prior to sequencing by treatment with S1 nuclease and Shrimp
Alkaline Phosphatase (SAP) (Roche). For this, 100 µl S1 nuclease/SAP is added to 10 µl PCR product.
The S1/SAP mixture is incubated at 37°C for 20 minutes followed by a 15 minute incubation at 80°C.
7 µl of the S1/SAP products are sequenced on an ABI 377 using the primer from the secondary PCR,
TNErm-2R or TNCm-2R2.
(References relating to foregoing protocols:

1) S. Schenk and Richard A. Laddaga
Improved method for electroporation of Staphylococcus aureus.
FEMS Microbiol Lett. 1992 Jul 1;73(1-2):133-8.
PMID: 1521761 [PubMed - indexed for MEDLINE]

2) Ginger Rhoads Kraemer and John J. Iandolo
High-Frequency Transformation of Staphylococcus aureus by Electroporation.
Current Microbiol. 1990 Vol. 21 Pp. 373-376

3) George A. O'Toole and Roberto Kolter
Initiation of biofilm formation in Pseudomonas fluorescens WCS365 proceeds via multiple,
convergent signalling pathways: a genetic analysis.
Mol Microbiol. 1998 May;28(3):449-61.
PMID: 9632250 [PubMed - indexed for MEDLINE])
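
The two-round amplification programme described in the DNA section above can be written down as data, which makes the cycling parameters easy to check before they are entered into a thermocycler. The sketch below is illustrative only: the names `ROUND_1`, `ROUND_2` and `total_hold_seconds` are hypothetical, and the calculation covers hold times only, since ramp times depend on the instrument and are not stated in the protocol.

```python
# Hold temperatures (deg C) and hold times (seconds) transcribed from the
# two-round PCR protocol in the text. Ramp times are instrument-dependent
# and are NOT included; all names here are illustrative assumptions.
ROUND_1 = [
    {"cycles": 1, "steps": [(94, 60)]},
    {"cycles": 6, "steps": [(94, 30), (30, 30), (72, 120)]},
    {"cycles": 30, "steps": [(94, 30), (45, 45), (72, 120)]},
]
ROUND_2 = [
    {"cycles": 1, "steps": [(94, 60)]},
    {"cycles": 30, "steps": [(94, 30), (50, 45), (72, 60)]},
]


def total_hold_seconds(program):
    """Sum of all hold times in a cycling program, ignoring ramp times."""
    return sum(
        stage["cycles"] * sum(seconds for _, seconds in stage["steps"])
        for stage in program
    )


if __name__ == "__main__":
    for name, program in (("round 1", ROUND_1), ("round 2", ROUND_2)):
        minutes = total_hold_seconds(program) / 60
        print(f"{name}: {minutes:.1f} min of hold time")
```

Keeping the low-stringency 30°C cycles as a separate stage mirrors the protocol, where they allow the semi-random primer to anneal before the higher-stringency cycles take over.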

Transposon insertions are generated in S. aureus using the above-described methods. The
pMOD, pMOD (Erm-1) and pMOD (Can) plasmids referred to in the described methods are shown
in Figures 3, 4 and 5 respectively. The sequences for these plasmids are contained in Figure 6 (SEQ ID
NO: 1), Figure 7 (SEQ ID NO: 2) and Figure 8 (SEQ ID NO: 3) respectively, and are also available at
www.epicentre.com/sequences.asp (Epicentre DNA sequences). Using these methods, >7400 transposon
mutants are generated.

High-Throughput Transposon Insertion Mapping (HTTIM)

Precise transposon insertion sites are determined by an anchored, semi-random PCR method for
amplification of the transposon/genome junction region (O'Toole and Kolter, 1998, Initiation of
biofilm formation in Pseudomonas fluorescens WCS365 proceeds via multiple, convergent signaling
pathways: a genetic analysis, Mol. Microbiol. 28(3): 449-61). The technique, HTTIM, uses both Tn5
specific and semi-random primers with conserved primer tails. A small aliquot of transposon mutant
liquid culture is used as a template and amplification of a fragment containing an insertion site is
achieved in a two-step process. The PCR product is then sequenced and the insertion site is entered into
an Oracle database for analysis. To date, about 7,000 insertions have been mapped, each insertion
representing the disruption of a gene or intergenic region that is not essential for survival on rich media.
Of these, about 7000 (6977) mutants are analyzed. Of these, about 6250 (6247, 89.5% of total) have
Tn5 sequences trimmed off. The mutants which map to COL comprise about 5600 (5609, or about
80.3% of total). The mutants which correspond to a unique restriction site are about 5000 (4980, which
corresponds to a sib rate of ~11.2% of total).

The mutants which map to an ORF are about 4650 (4651). Of these, 1404 ORFs are disrupted
(51.2% of total). Of the mutants analyzed, 140 map to rDNA and 818 (14.6% of mapped mutants) are
intergenic mutants.

Further, the analysis revealed a total of 2387600 bp of COL in ORFs or rDNA (15.0%
intergenic regions).
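
The mapping statistics above follow from a handful of raw counts, and the arithmetic can be re-derived as a consistency check. The sketch below uses only the counts quoted in the text; the dictionary keys and the helper `pct` are hypothetical names, and the sib rate is assumed to mean the fraction of mapped mutants that do not correspond to a unique site.

```python
# Raw counts quoted in the text; key names are illustrative assumptions.
counts = {
    "analyzed": 6977,
    "trimmed": 6247,
    "mapped": 5609,
    "unique_sites": 4980,
    "in_orf": 4651,
    "in_rdna": 140,
    "intergenic": 818,
}


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


trimmed_pct = pct(counts["trimmed"], counts["analyzed"])
mapped_pct = pct(counts["mapped"], counts["analyzed"])
# Assumed definition: siblings are mapped mutants sharing an already-seen site.
sib_rate_pct = 100.0 - pct(counts["unique_sites"], counts["mapped"])
intergenic_pct = pct(counts["intergenic"], counts["mapped"])

if __name__ == "__main__":
    print(f"trimmed: {trimmed_pct:.1f}% of analyzed")
    print(f"mapped: {mapped_pct:.1f}% of analyzed")
    print(f"sib rate: {sib_rate_pct:.1f}% of mapped")
    print(f"intergenic: {intergenic_pct:.1f}% of mapped")
```

The ORF, rDNA and intergenic counts sum to the number of mapped mutants, which is why the intergenic percentage is taken relative to mapped rather than analyzed mutants.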

With every insertion added to the map, the regions of the genome containing essential genes, and
particularly those containing operons containing essential genes (because of potential polar effects of
insertions in upstream genes), begin to become apparent because these regions will not be able to
accommodate transposon insertions. Table 1 shows a listing of the open reading frames identified as
existing between transposon insertions, with an assigned probability of essentiality according to the
length of the putative open reading frames. These open reading frames can be subjected to further
analysis. For instance, the predicted ORFs can be examined individually for (1) identity with known
genes of S. aureus with sequences deposited in GenBank, (2) similarity with well-characterized genes
from other bacteria, or (3) presence of known functional motifs.
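
The idea of listing open reading frames that lie between transposon insertions can be sketched as a simple interval lookup. This is a minimal illustration, not the database query actually used: the function name, the `(name, start, end)` tuple layout with inclusive 1-based coordinates, and the toy data are all assumptions.

```python
from bisect import bisect_left


def orfs_without_insertions(orfs, insertion_sites):
    """Return (name, length) for ORFs containing no insertion site.

    orfs: iterable of (name, start, end) with inclusive coordinates.
    insertion_sites: iterable of genome positions of mapped insertions.
    The result is sorted longest first, since longer untouched ORFs are
    the stronger candidates.
    """
    sites = sorted(insertion_sites)
    untouched = []
    for name, start, end in orfs:
        # First insertion at or after the ORF start, if any.
        i = bisect_left(sites, start)
        hit = i < len(sites) and sites[i] <= end
        if not hit:
            untouched.append((name, end - start + 1))
    return sorted(untouched, key=lambda item: -item[1])


if __name__ == "__main__":
    toy_orfs = [("orfA", 100, 400), ("orfB", 500, 2000), ("orfC", 2100, 2300)]
    toy_sites = [150, 2050, 2500]
    print(orfs_without_insertions(toy_orfs, toy_sites))
```

Sorting the insertion sites once and using binary search keeps the lookup fast even with several thousand insertions and ORFs.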
Statistical Analysis of Putative Essential and Important Genes 

Probability correlates with length of the ORF, such that the longer the ORF, the higher the
probability of hitting the ORF in a random transposon mutagenesis experiment, and the higher the 
confidence level that the ORF represents an essential or an important gene given that no transposon 
insertions therein were isolated. Statistical confidence levels in essentiality or importance can help 
narrow the focus in the screening of specific genes, thereby shortening the verification process and the 
subsequent identification of antibacterial agents specific for that gene or gene product. Thus, one of the 
benefits of the HTTIM approach is that it is a quantitative approach that lends itself well to statistical 
analysis. 
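
The dependence on ORF length can be made concrete under the simplest possible assumption, namely that each insertion lands uniformly and independently along the genome. The sketch below is not the model of Example 1; the genome size of 2.8 Mb is an assumed round figure, and the insertion count is the number of unique sites quoted earlier.

```python
import math

GENOME_SIZE_BP = 2_800_000  # assumed round figure, not taken from the text
N_INSERTIONS = 4980         # unique insertion sites quoted in the text


def p_no_hit(orf_length_bp, genome_size_bp=GENOME_SIZE_BP, n_insertions=N_INSERTIONS):
    """Chance that every one of n uniform, independent insertions misses an ORF."""
    if not 0 <= orf_length_bp <= genome_size_bp:
        raise ValueError("ORF length must lie between 0 and the genome size")
    return (1.0 - orf_length_bp / genome_size_bp) ** n_insertions


if __name__ == "__main__":
    for length in (300, 600, 1500, 3000):
        print(f"{length} bp ORF missed by chance: {p_no_hit(length):.3f}")
```

Under this assumption the chance of a miss falls roughly exponentially with length, which is why an untouched long ORF is stronger evidence of essentiality than an untouched short one.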

The High-Throughput Transposon Insertion Mapping (HTTIM) strategy utilizes a transposon, 
which is a small, mobile DNA element that randomly inserts into the chromosome. Any transposon may 
be employed so long as its insertion into the chromosome is random, i.e., devoid of hot spots. 

When the transposon insertion disrupts one of the genes in the Staphylococcus genome,
the function of that gene is lost. If the disrupted gene is essential for growth, the transposon insertion
mutant dies and cannot be characterized. If the transposon disrupts a gene that is non-essential, the
mutant survives and grows, and the transposon insertion site is mapped. By examining the insertion sites
of a large number of transposon mutants, all of the non-essential S. aureus genes can be identified, and by
implication, all of the essential genes may be identified as well. Characterization of about 7000
transposon insertions revealed insertions in non-essential genes and an even distribution of
insertions across the entire length of the genome. The remaining genes, in which a transposon
insertion has never been observed, are candidate essential genes (48.8%).

Because insertion of the transposon used here into the chromosome was proposed to be random, 
it was possible that some of the Staphylococcus aureus genes that did not receive a transposon insertion 
were simply not hit by random chance. One cannot truly know that a transposon has no hot spots and is 
entirely random until the data is analyzed, and the data here confirmed that the transposon derivative 
employed underwent random insertion in S. aureus. Thus, the chance that a gene will not be hit by the 
transposon as a matter of random chance increases as the length of the gene decreases, particularly for 
very small genes (< 600 base pairs). 

A Bayesian statistical model for truncated counting data is applied to the candidate essential
gene set, and permits a determination that 37 percent of S. aureus genes are essential. Such a model
may therefore be utilized to increase the statistical confidence that a given gene in the candidate subset 
is essential. An exemplary statistical model is provided in Example 1. 
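
How a prior proportion of essential genes combines with ORF length can be illustrated with a two-class application of Bayes' rule. This is a deliberately simplified sketch and not the truncated-count model of Example 1: it assumes an essential gene is never hit, a non-essential gene is missed with Poisson probability exp(-rate x length), and the insertion rate per base pair is a hypothetical value derived from an assumed 2.8 Mb genome.

```python
import math

PRIOR_ESSENTIAL = 0.37                 # proportion quoted in the text
INSERTIONS_PER_BP = 4980 / 2_800_000   # hypothetical rate; genome size is assumed


def posterior_essential(orf_length_bp,
                        prior_essential=PRIOR_ESSENTIAL,
                        insertions_per_bp=INSERTIONS_PER_BP):
    """P(essential | no insertion observed) under the two-class assumption."""
    # A non-essential ORF escapes every insertion with this probability.
    p_miss_if_nonessential = math.exp(-insertions_per_bp * orf_length_bp)
    # An essential ORF is assumed to show no insertions with probability 1.
    evidence = prior_essential + (1.0 - prior_essential) * p_miss_if_nonessential
    return prior_essential / evidence


if __name__ == "__main__":
    for length in (300, 600, 1500, 3000):
        print(f"{length} bp, no insertions: P(essential) = {posterior_essential(length):.2f}")
```

For a very short ORF the posterior stays close to the prior, which matches the caution in the text about genes under 600 base pairs.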
Physical Methods for Target Gene Validation 

While the above methodology and the database of putative essential and important gene 
candidates established thereby is believed to be superior to existing methods with regard to the quantity 
of experimentation required to identify essential and important genes in S. aureus and the degree of 
confidence conferred, it should be understood that the methodology described herein can be 
incorporated into combined protocols with technology known in the art. For instance, the methods for 
verifying essentiality disclosed in WO 01/07651, herein incorporated by reference in its entirety, would
be useful as a secondary method to be utilized in combination with the methods described in this
disclosure. Alternatively or additionally, one of several approaches may be used to determine whether a
particular gene is essential (absolutely required for survival on rich medium) or important (the absence
of which results in attenuated growth) to S. aureus.
Integration Knockouts 

In one preferred embodiment of the invention, target validation is accomplished by use of 
integration knockouts. Methods of generating integration knockouts are known in the art. In one 
method, PCR is used to amplify a small (200-500 base pairs) portion of the coding sequence, or open
reading frame (ORF), of the gene of interest. This fragment should be centrally located within the ORF.
It should not include either terminus of the gene's coding region. This fragment is then cloned into a
plasmid vector that cannot replicate in S. aureus. The vector should have a drug resistance marker that
is suitable for selection in S. aureus. Such a vector is then transformed into an electroporation
competent strain of S. aureus, such as RN4220. 

Following electroporation, the culture is plated on media which selects for S. aureus that contain
the plasmid, and colonies that arise are the result of homologous recombination between the S. aureus
chromosome and the cloned gene fragment on the plasmid. This is referred to as single-crossover
recombination; a single recombination event takes place between the plasmid and the chromosome. This
results in the integration of the entire plasmid into the S. aureus chromosome and the disruption of the
gene from which the fragment is amplified (Fig. 1).
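
Choosing the internal fragment for an integration knockout reduces to picking a window of 200-500 base pairs centred in the ORF that excludes both ends of the coding region. The helper below is a hypothetical sketch of that coordinate arithmetic only; it does not design primers, and the function name, default fragment length and 0-based half-open convention are assumptions.

```python
def central_fragment(orf_length_bp, fragment_length_bp=300):
    """Return (start, end) of a centred fragment, 0-based and half-open.

    The fragment must be 200-500 bp, per the text, and must leave at least
    one base on each side so that neither terminus of the ORF is included.
    """
    if not 200 <= fragment_length_bp <= 500:
        raise ValueError("fragment length should be 200-500 bp")
    if orf_length_bp < fragment_length_bp + 2:
        raise ValueError("ORF too short to exclude both termini")
    start = (orf_length_bp - fragment_length_bp) // 2
    return start, start + fragment_length_bp


if __name__ == "__main__":
    start, end = central_fragment(1200, 300)
    print(f"amplify ORF positions {start}..{end} (0-based, end exclusive)")
```

Because the fragment lacks both termini, a single crossover leaves two truncated copies of the gene flanking the integrated plasmid, neither of which is complete.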

Variations of this approach are also possible. For instance, one could clone out the entire locus
and isolate transposon insertion mutants in E. coli. Then, using general molecular biology techniques,
i.e. by transposition from the E. coli genome, one can select plasmid insertions by transferring the vector
into a recipient cell that does not contain the transposon or the antibiotic resistance marker encoded by
the transposon. The plasmid would then be analyzed for insertions in the cloned gene. Thereafter, a
similar assay could be performed by screening for double crossover events in S. aureus that result in
recombination of the transposon into the chromosomal locus from the suicide vector.

Integration of the plasmid, or other insertion at the locus, can be confirmed by a relatively rapid
PCR-based screen of the resulting recombinant clones. The advantage of this strategy, particularly the
plasmid single crossover strategy, is that it requires only amplification of a short stretch of DNA
followed by a single cloning step before recombination experiments can be performed. The
disadvantage is that if the target gene is essential, no recombinants can be obtained, and failure to obtain
recombinants is only suggestive evidence for essentiality. However, if a gene is
in fact non-essential, this method will demonstrate that quickly.
Integration Knockouts with Extra-chromosomal Complementation

In another embodiment of the invention, target validation is accomplished by use of integration
knockouts with extra-chromosomal complementation. The method provides more convincing data when
the target gene is essential. It employs the same type of non-replicating plasmid as described above, but
recombinations are performed in strains already carrying a second copy of the target gene on an
extra-chromosomal plasmid. This second copy can then supply the essential function when the chromosomal
copy is disrupted. If disruptions can only be obtained when a complementing plasmid is present and not
when a control plasmid is present, this is strong evidence that the target gene is essential. The advantage
of this method is that colonies are obtained even when the gene of interest is essential. The disadvantage
is that construction and sequencing of the complementing plasmid takes additional time.
Integration with a Regulatable Promoter (Promoter Swap) 

In yet another embodiment of the invention, target validation is accomplished by use of 
integration with a regulatable promoter (a promoter swap). This approach also involves selecting for 
chromosomal integration of non-replicating plasmids via homologous recombination. However, the
design of the integrating plasmid is different. In this case, the 5' 300-500 base pairs of the coding
sequence of the target gene is PCR amplified and cloned into a vector downstream of a regulated
promoter, e.g. a tet, xyl or spac promoter, which is inducible in the presence of anhydrotetracycline,
xylose, or IPTG, respectively. The activity of the promoter can be modulated by the presence of a
specific inducer molecule. The plasmid is electroporated into S. aureus and integration events are selected
for under conditions where the regulatable promoter is active. The resulting chromosomal integration
replaces the target gene's natural promoter with the regulatable promoter from the plasmid (Fig. 2). If
the target gene is essential, recombinants can only survive when the inducer molecule is present in their 
growth media to stimulate expression of the target gene. If the gene is non-essential, the recombinants' 
growth is independent of the addition of the inducer. The advantage of this strategy is that it requires 
only amplification of a short stretch of DNA followed by a single cloning step before recombination 
experiments can be performed. 
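
The reading of a promoter-swap experiment described above amounts to a small decision table over growth with and without inducer. The function below is a hypothetical sketch of that interpretation; the labels it returns and the treatment of the no-growth-with-inducer case as inconclusive are assumptions, not part of the disclosed method.

```python
def interpret_promoter_swap(grows_with_inducer, grows_without_inducer):
    """Classify a promoter-swap recombinant from two growth observations."""
    if grows_with_inducer and not grows_without_inducer:
        # Survival depends on induced expression of the target gene.
        return "consistent with essential"
    if grows_with_inducer and grows_without_inducer:
        # Growth is independent of the inducer.
        return "consistent with non-essential"
    # No growth even when the promoter is induced: the construct or the
    # integration itself may be at fault, so no conclusion is drawn.
    return "inconclusive"


if __name__ == "__main__":
    print(interpret_promoter_swap(True, False))
    print(interpret_promoter_swap(True, True))
```

In practice a leaky promoter can blur the first two outcomes, so a reduced growth rate without inducer may point to an important rather than a strictly essential gene.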
References: 

1. Lana Kim, Axel Mogk and Wolfgang Schumann. 1996. A xylose-inducible Bacillus subtilis
integration vector and its application. Gene 181: 71-76.

2. Bateman, B. T., N. P. Donegan, T. M. Jarry, M. Palma, and A. L. Cheung. 2001. Evaluation of a
tetracycline-inducible promoter in S. aureus in vitro and in vivo and its application in demonstrating
the role of sigB in microcolony formation. Infection and Immunity 69 (12): 7851-7857.

3. Yansura, D., and D. J. Henner. 1984. Use of the Escherichia coli lac repressor and operator to
control gene expression in Bacillus subtilis. Proc. Natl. Acad. Sci. USA 81: 439-443.

Accordingly, the invention includes a method for identifying an essential or important gene in a 
Staphylococcus genome comprising generating random transposon insertions in a Staphylococcal 
genome and screening the mutants for essential and important genes.

Preferably, the method for generating random insertions into a Staphylococcal genome comprises
subjecting Staphylococcal cells to random mutagenesis and culturing the mutagenized cells in a recovery
broth. Preferably, the recovery broth is B2 broth.

The method may further comprise validating the identification of an essential or important gene 
by use of one or more confirmation processes. Such confirmation processes include, but are not limited
to, confirmation by use of integration knockouts, confirmation by use of integration knockouts with
extra-chromosomal complementation, and confirmation by use of integration with a regulatable promoter
(promoter swap).
LIST OF EMBODIMENTS:

1. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding a polypeptide
having at least 80% sequence identity to a polypeptide encoded by a nucleic acid sequence selected from
the group consisting of the Staphylococcus aureus open reading frames (ORFs) listed in Table 1.

2. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding a polypeptide 
having at least 80% sequence identity to a polypeptide encoded by an essential or important nucleic acid 
sequence selected from the group consisting of the Staphylococcus aureus open reading frames (ORFs)
listed in Table 1, wherein said essential or important nucleic acid sequence is identified as being 
essential or important by integration knock-out coupled with extra-chromosomal complementation. 

3. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding a polypeptide
having at least 80% sequence identity to a polypeptide encoded by an essential or important nucleic acid 
sequence selected from the group consisting of the Staphylococcus aureus open reading frames (ORFs) 
listed in Table 1, wherein said essential or important nucleic acid sequence is identified as being 
essential by integration of a regulatable promoter into the gene.

4. A method of screening for an antibacterial agent, comprising determining whether a test
compound is active against the bacterial gene of embodiment 1.

5. A method of screening for an antibacterial agent, comprising determining whether a test
compound is active against the protein encoded by the bacterial gene of embodiment 1.

6. A method of screening for an antibacterial agent, comprising determining whether a test 
compound is active against the essential or important bacterial gene of embodiment 2. 

7. A method of screening for an antibacterial agent, comprising determining whether a test 
compound is active against the protein encoded by the essential or important bacterial gene of 
embodiment 2. 

8. A method of screening for an antibacterial agent, comprising determining whether a test 
compound is active against the essential or important bacterial gene of embodiment 3. 

9. A method of screening for an antibacterial agent, comprising determining whether a test 
compound is active against the protein encoded by the essential or important bacterial gene of 
embodiment 3.

10. The method of embodiment 5, comprising the steps of:

a) contacting said protein or a biologically active fragment thereof with a test compound; and 

b) determining whether said test compound binds to said protein or said fragment; 

wherein binding of said test compound to said polypeptide or said fragment is indicative that said test 
compound is an antibacterial agent. 

11. The method of embodiment 7, comprising the steps of:

a) contacting said protein or a biologically active fragment thereof with a test compound; and 

b) determining whether said test compound binds to said protein or said fragment;

wherein binding of said test compound to said polypeptide or said fragment is indicative that said test
compound is an antibacterial agent.

12. The method of embodiment 9, comprising the steps of: 

a) contacting said protein or a biologically active fragment thereof with a test compound; and 

b) determining whether said test compound binds to said protein or said fragment; 

wherein binding of said test compound to said polypeptide or said fragment is indicative that said test
compound is an antibacterial agent. 

13. A method for evaluating a test agent for inhibition of expression of the gene of embodiment 1,
comprising: 

a) contacting a cell expressing said gene with said agent; and 

b) determining the amount or level of expression of said essential gene in said sample. 

14. A method for evaluating a test agent for inhibition of expression of the essential or important
gene of embodiment 2, comprising: 

a) contacting a cell expressing said essential or important gene with said agent; and

b) determining the amount or level of expression of said essential or important gene in said
sample.

15. A method for evaluating a test agent for inhibition of expression of the essential or important 
gene of embodiment 3, comprising: 

a) contacting a cell expressing said essential or important gene with said agent; and 

b) determining the amount or level of expression of said essential or important gene in said 
sample. 

16. The method of embodiment 13, wherein said level of expression is measured by measuring the 
amount of expression product in said cell relative to a cell that has not been contacted with said agent.

17. The method of embodiment 13, wherein said level of expression is measured by measuring the
level of expression of a gene fusion to said gene relative to a cell containing said gene fusion that has not
been contacted with said agent.

18. The method of embodiment 13, wherein said level of expression is measured by measuring the
level of expression of a protein fusion to said gene relative to a cell containing said protein fusion that
has not been contacted with said agent.

19. A method for evaluating a potential antibacterial agent, comprising the steps of:

a) providing a bacterial strain comprising a mutant form of the gene of embodiment 1, wherein 
said mutant form of the gene confers a growth conditional or attenuated growth phenotype; 

b) contacting bacteria of said bacterial strain with said test compound in semi-permissive or 
permissive growth conditions; and 

c) determining whether the growth of said bacterial strain comprising said mutant form of a gene
is reduced in the presence of said test compound to a greater extent than a comparison bacteria 
comprising a normal form of said gene. 

20. A library of nucleic acid sequences consisting essentially of nucleic acid sequences having at 
least about 80% protein sequence identity to a nucleic acid sequence selected from the group consisting 
of the Staphylococcus aureus open reading frames (ORFs) listed in Table 1, wherein said library of 
nucleic acid sequences is employed to identify essential genes in Staphylococcus. 

21. A map of at least about 500-1500 transposon insertions in the genome of Staphylococcus aureus,
wherein said map is useful for identifying genes that are essential for survival of said Staphylococcus 
aureus. 

22. A vector comprising a promoter operably linked to the nucleic acid sequence of embodiment 1.

23. The vector of embodiment 22, wherein said promoter is active in Staphylococcus aureus,
Escherichia coli, Pseudomonas aeruginosa, Hemophilus influenzae, Neisseria gonorrhea, Klebsiella
pneumoniae, and Streptococci.

24. A host cell comprising the vector of embodiment 22.

25. A fragment of the nucleic acid of embodiment 1, said fragment comprising at least 10, at least
20, at least 25, at least 30, or at least 50 consecutive bases of said nucleic acid. 

26. A protein having at least about 80% sequence identity to the protein encoded by the nucleic acid 
of embodiment 1. 

27. A protein having at least about 80% sequence identity to the protein encoded by the nucleic acid 
of embodiment 2. 

28. A protein having at least about 80% sequence identity to the protein encoded by the nucleic acid 
of embodiment 3. 

29. An antibody or antibody fragment capable of specifically binding the protein of embodiment 26. 

30. An antibody or antibody fragment capable of specifically binding the protein of embodiment 27.

31. An antibody or antibody fragment capable of specifically binding the protein of embodiment 28.

32. An agent identified as having anti-bacterial activity by any of the methods of embodiments 4-19. 

33. A method for inhibiting the growth or survival of Staphylococcus aureus comprising contacting
said bacteria with the agent of embodiment 32 so as to inhibit growth or survival. 

34. A pharmaceutical composition comprising the agent of embodiment 32.

35. A method for treating a patient having a Staphylococcus aureus infection, comprising 
administering to said patient an amount of the agent of embodiment 32 effective to reduce or inhibit 
growth or survival of said Staphylococcus aureus.

36. A method of protecting a patient against a Staphylococcus aureus infection, comprising
administering to said patient an amount of the agent of embodiment 32 effective to prevent said patient
from acquiring a Staphylococcus aureus infection.

37. The isolated nucleic acid molecule of embodiment 2, wherein said nucleic acid contains an
essential gene. 

38. The nucleic acid library of embodiment 20, wherein said library is in electronic form.

39. The library of embodiment 38, wherein said electronic form is selected from the group consisting
of magnetic storage media, such as a floppy disc, a hard disc storage medium, and a magnetic tape;
optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; hybrids of
these categories such as magnetic/optical storage media; computer readable forms such as a word
processing text file, database format, searchable files, executable files and search program software.

40. The transposon insertion map of embodiment 21, wherein said map is in electronic form.

41. The map of embodiment 40, wherein said electronic form is selected from the group consisting
of magnetic storage media, such as a floppy disc, a hard disc storage medium, and a magnetic tape;
optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; hybrids of
these categories such as magnetic/optical storage media; computer readable forms such as a word
processing text file, database format, searchable files, executable files and search program software.

42. A method for identifying a library of putative essential or important genes using a High 
Throughput Transposon Insertion Database (HTTIM), comprising: 

(a) mutagenizing a Staphylococcus genome with a transposon such that individual cells 
containing at least one transposon insertion are isolated; 

(b) collecting and mapping said at least one transposon insertion in each individual cell so as to 
form a database of transposon insertion sites, or an HTTIM;

(c) comparing said database of transposon insertion sites with a database comprising the genomic
sequence of the bacterium to identify open reading frames in said genomic sequence database that are
not disrupted by a transposon insertion;

(d) forming a library from said putative essential or important genes that are not disrupted by a
transposon.

43. The method of embodiment 42, wherein said bacteria is S. aureus.

44. The method of embodiment 42, wherein said transposon inserts randomly into the target genome. 

45. The method of embodiment 42, wherein said transposon is 3,000 to 6,000. 

46. The method of embodiment 42, wherein said HTTIM comprises at least about 4,000 to 5,000 
transposon insertion sites. 

47. The library of putative essential or important genes identified by the method of embodiment 42, 
wherein said library comprises at most about 500 to 1850 genes. 

48. The library of putative essential or important genes identified by the method of embodiment 42, 
wherein said library comprises at most about 1000 to 1400 genes. 

49. The library of putative essential or important genes identified by the method of embodiment 42,
wherein said library comprises at most about 600-625 genes.

50. The library of putative essential or important genes identified by the method of embodiment 42,
wherein said library comprises at most about 530-543 genes.

51. The method of embodiment 42, further comprising a statistical calculation for identifying
putative essential or important genes.

52. The method of embodiment 51, further comprising the statistical method applied herein.

53. The method of embodiment 42, further comprising a physical mutagenesis experiment in order to 
verify essential or important genes.

54. The method of embodiment 53, wherein said physical mutagenesis comprises knocking out a
putative essential or important gene or creating a promoter swap mutant.

55. An essential or important gene identified by the method of embodiment 53. 

56. An antibacterial agent that targets the gene of embodiment 55, or the gene product encoded by
said gene. 

57. A pharmaceutical composition comprising said antibacterial agent of embodiment 56.
Examples: Essential Genes Identified

Example 1: A Bayesian Statistical Model for Increasing Statistical Confidence of Essentiality

A Bayesian statistical model for truncated counting data was applied to the candidate essential
gene set, and permitted a determination that about 37 percent of S. aureus genes are essential. This
model may therefore be utilized to increase the statistical confidence that a given gene in the candidate
subset is essential, by the following rationale. For a given set of genes, the percentage of nonessential
genes is independent of gene size. For a fixed gene size $S$, the observations $X_1, X_2, \ldots, X_N$ are
Poisson($\lambda S$), of which all observations of value zero are missing. Let
$\{x'_1, x'_2, \ldots, x'_n\} \subseteq \{X_1, X_2, \ldots, X_N\}$ be
the subset of all nonzero observations. Then the subset $\{x'_1, x'_2, \ldots, x'_n\}$ composes a random sample of size
$n$ from a truncated Poisson distribution and the likelihood function of the joint distribution of
$\{n, x'_1, x'_2, \ldots, x'_n\}$ conditional on the total number of nonessential genes, $N$, can be obtained as follows



where $s = x'_1 + x'_2 + \cdots + x'_n$ and $N$ is the number of nonessential genes of size $S$.
The Bayesian model consists of the conditional model and a prior distribution on the parameter
$N$. Assume $N$, the number of nonessential genes, is distributed as binomial $B(M, \gamma)$, with $M$ being the
total number of genes of size $S$ and $\gamma$ the proportion of nonessential genes, which is an unknown
constant and is independent of gene size. The likelihood function of the joint distribution of
$\{N, n, \gamma, x'_1, x'_2, \ldots, x'_n\}$ can be written as



Let $\delta = (\delta_1, \delta_2, \ldots, \delta_g)^T$ be a vector of g different gene sizes and
$M = (M_1, M_2, \ldots, M_g)^T$ be the vector of known numbers of total genes,
$N = (N_1, N_2, \ldots, N_g)^T$ be the unknown numbers of nonessential genes,
$n = (n_1, n_2, \ldots, n_g)^T$ be the vector of numbers of nonzero observations from the nonessential genes, and
$S = (S_1, S_2, \ldots, S_g)^T$ be the sums of nonzero observations. The likelihood function of the joint distribution
of {N, n, Y, S} can be written as



where $\|\cdot\|_1$ is the $L_1$ norm of a vector, and $\delta^T N = \sum_{i=1}^{g} \delta_i N_i$.

Up to an additive constant, the log-likelihood function of the joint distribution of {N, n, Y, S} can
be written as

$$\ell(\gamma, \lambda, N) = \|N\|_1 \ln(\gamma) + (\|M\|_1 - \|N\|_1) \ln(1 - \gamma) + \|S\|_1 \ln(\lambda) - \lambda (\delta^T N) - \sum_{i=1}^{g} \ln((M_i - N_i)!) - \sum_{i=1}^{g} \ln((N_i - n_i)!)$$

and the maximum likelihood (ML) estimators of the parameters $\gamma$ and $\lambda$ are

$$\hat{\gamma} = \|N\|_1 / \|M\|_1 \quad \text{and} \quad \hat{\lambda} = \|S\|_1 / (\delta^T N)$$

However, when g is large, say, in the order of hundreds, as in the present disclosure, obtaining the ML
estimator of the parameter vector $N = (N_1, N_2, \ldots, N_g)^T$ in a high dimensional parameter space is a
challenging problem. A searching algorithm was developed to find the maximum likelihood estimator as
$\hat{N} = n \oplus K^*$, where $\oplus$, an operator between the observed vector n and any integer
$0 \le k \le \|M\|_1 - \|n\|_1$, is defined as follows:

$n \oplus 0 = n$,

$n \oplus 1 = \{n + 1_j : \Delta_j \ell^*(n) > \Delta_i \ell^*(n) \text{ for all } i \ne j\}$,

$n \oplus k = (n \oplus (k-1)) \oplus 1$ for $k \ge 2$,

and $K^* = \max\{k^* \ge 0 : G(k) > 0 \text{ for all } 0 < k \le k^*\}$.

As a result of this modeling, we were able to estimate that 16 to 17 percent of the genes are 
essential. 

Alternatively, a stepwise maximum likelihood (ML) search may be used to obtain the ML
estimator as follows. For any $N = (N_1, N_2, \ldots, N_g)^T$, it is easy to verify using (2.7) that the ML
estimators of the parameters $\gamma$ and $\lambda$ are

$$\hat{\gamma} = \|N\|_1 / \|M\|_1 \qquad (3.1)$$

and

$$\hat{\lambda} = \|S\|_1 / (\delta^T N) \qquad (3.2)$$

respectively. Substituting (3.1) and (3.2) for $\gamma$ and $\lambda$, respectively, in (2.6), we have

$$\ell^*(N) \propto \|N\|_1 \ln(\|N\|_1) + (\|M\|_1 - \|N\|_1) \ln(\|M\|_1 - \|N\|_1) - \|S\|_1 \ln(\delta^T N) - \sum_{i=1}^{g} \left[\ln((M_i - N_i)!) + \ln((N_i - n_i)!)\right] \qquad (3.3)$$
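As a numerical illustration of (3.1) and (3.2), the sketch below evaluates the log-likelihood (up to its additive constant) and the closed-form estimators for a fixed vector N. The function names and the toy numbers are hypothetical and are not taken from the disclosure; the code only assumes the log-likelihood expression stated in this Example.

```python
import math

def log_likelihood(gamma, lam, N, M, n, S, delta):
    """Log-likelihood up to an additive constant.

    N, M, n, S, delta are equal-length lists indexed by gene-size class.
    """
    N1, M1, S1 = sum(N), sum(M), sum(S)
    dTN = sum(d * Ni for d, Ni in zip(delta, N))  # delta^T N
    value = N1 * math.log(gamma) + (M1 - N1) * math.log(1.0 - gamma)
    value += S1 * math.log(lam) - lam * dTN
    for Mi, Ni, ni in zip(M, N, n):
        # ln(k!) computed as lgamma(k + 1)
        value -= math.lgamma(Mi - Ni + 1) + math.lgamma(Ni - ni + 1)
    return value

def ml_gamma_lambda(N, M, S, delta):
    """Closed-form ML estimators (3.1) and (3.2) for a fixed N."""
    gamma_hat = sum(N) / sum(M)
    lam_hat = sum(S) / sum(d * Ni for d, Ni in zip(delta, N))
    return gamma_hat, lam_hat

if __name__ == "__main__":
    # Toy data (hypothetical): two gene-size classes.
    delta = [300, 900]   # gene sizes
    M = [50, 40]         # total genes per class
    n = [20, 30]         # genes with at least one insertion
    S = [25, 70]         # summed insertion counts
    N = [30, 35]         # a candidate number of nonessential genes
    g_hat, l_hat = ml_gamma_lambda(N, M, S, delta)
    print(g_hat, l_hat, log_likelihood(g_hat, l_hat, N, M, n, S, delta))
```

Because the log-likelihood is concave in $\gamma$ and in $\lambda$ separately, moving either parameter away from its estimator lowers the value for the same N, which is a quick sanity check on the two formulas.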



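Define

$$\Delta_i \ell^*(N) = \ell^*(N + 1_i) - \ell^*(N) \qquad (3.4)$$

for any $i \in \{1, 2, \ldots, g\}$ and $N \in \{N : n_i \le N_i < M_i;\ n_j \le N_j \le M_j,\ j \ne i\}$, where
$1_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ with 1 at the i-th position. For notational purposes, let

$$\eta(k) = k \ln(k) + (\|M\|_1 - k) \ln(\|M\|_1 - k) \qquad (3.5)$$

for $\|n\|_1 \le k < \|M\|_1$. Then, (3.4) can be written as

$$\Delta_i \ell^*(N) = \eta(\|N\|_1 + 1) - \eta(\|N\|_1) - \|S\|_1 \ln\left(1 + \frac{\delta_i}{\delta^T N}\right) + \ln\left[\frac{M_i - N_i}{N_i - n_i + 1}\right] \qquad (3.6)$$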



To obtain the ML estimator of N, we define an operator, denoted as $\oplus$, between the observed vector n and
any integer $0 \le k \le \|M\|_1 - \|n\|_1$, as follows:

$n \oplus 0 = n$,

$n \oplus 1 = \{n + 1_j : \Delta_j \ell^*(n) > \Delta_i \ell^*(n) \text{ for all } i \ne j\}$, and (3.7)

$n \oplus k = (n \oplus (k-1)) \oplus 1$ for $k \ge 2$.

We also define a likelihood-gain function G as

$$G(0) = 0 \qquad (3.8)$$

$$G(k) = \ell^*(n \oplus k) - \ell^*(n \oplus (k-1)), \quad \text{for } 1 \le k \le \|M\|_1 - \|n\|_1$$



THEOREM 1: If




(3.9) 



then G(1)>0. 

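Proof: If $G(1) \le 0$, then by (3.5),

$$\Delta_i \ell^*(n) \le 0 \quad \text{for all } 1 \le i \le g$$

$$\Rightarrow\ \eta(\|n\|_1 + 1) - \eta(\|n\|_1) - \|S\|_1 \ln(1 + \delta_i / (\delta^T n)) + \ln(M_i - n_i) \le 0$$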



^^j^nfi ^ (II n II. -Hlf • (ll M II. - II n H. -if 

(ii«ii.r-(ii^ii.-ii«ii.r"" 



Adding the two sides up over i, we have



l|Mi|,-&nli,-l 



Using the facts that, for any x > 0, $(1 + 1/x)^x < e$, $(1 + 1/x)^{x+1} > e$, and $(1 - 1/x)^{x-1} > e^{-1}$, we obtain
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Jexp[M|iV||n||,.e.e-H|fi|l. 



which is a contradiction to the condition (3.9).

When g = 1 the condition (3.9) becomes a condition on the mean count $(x'_1 + \cdots + x'_n)/n$. Hence, this
theorem says, on average, when the mean count is less than the natural logarithm of the number of nonzero
observations, the vector n cannot be the ML estimator of N. In other words, when the mean count is not too
large, there must be some missing observations from nonessential genes.
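The single-size case (g = 1) can be checked numerically. The sketch below evaluates the first-step gain G(1) directly from the increment formula (3.6) with N = n and g = 1, where the gene size cancels and the term $\ln(1 + \delta_i/(\delta^T n))$ reduces to $\ln(1 + 1/n)$. The function name and the numbers are hypothetical; the exact form of condition (3.9) is not reproduced here, so this only illustrates the stated interpretation.

```python
import math

def first_step_gain(M, n, s):
    """G(1) for a single gene-size class (g = 1).

    M: total genes, n: genes with a nonzero count, s: sum of the counts.
    Assumes 1 <= n and n + 1 < M so that every logarithm is defined.
    """
    def eta(k):
        # eta(k) = k ln k + (M - k) ln(M - k), as in (3.5)
        return k * math.log(k) + (M - k) * math.log(M - k)

    # (3.6) with N = n: the last term is ln[(M - n) / 1]
    return eta(n + 1) - eta(n) - s * math.log(1 + 1 / n) + math.log(M - n)

if __name__ == "__main__":
    M, n = 1000, 100
    for s in (200, 600):  # mean counts 2 and 6; ln(100) is about 4.6
        print(s / n, math.log(n), first_step_gain(M, n, s))
```

With these toy values the gain is positive when the mean count is below ln(n) and negative when it is well above, in line with the reading of Theorem 1 given above.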



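THEOREM 2:

$$\Delta_i \ell^*(N) > \Delta_i \ell^*(N - 1_j) \quad \text{for all } i \ne j \qquad (3.10)$$

Proof: By definition in (3.5),

$$\frac{d[\eta(x+1) - \eta(x)]}{dx} = \ln\left(\frac{x+1}{x} \cdot \frac{\|M\|_1 - x}{\|M\|_1 - x - 1}\right) > 0$$

Hence

$$\Delta_i \ell^*(N) - \Delta_i \ell^*(N - 1_j) = (\eta(\|N\|_1 + 1) - \eta(\|N\|_1)) - (\eta(\|N\|_1) - \eta(\|N\|_1 - 1)) - \|S\|_1 \ln(1 + \delta_i / (\delta^T N)) + \|S\|_1 \ln(1 + \delta_i / (\delta^T N - \delta_j))$$

$$> \|S\|_1 \left[\ln(1 + \delta_i / (\delta^T N - \delta_j)) - \ln(1 + \delta_i / (\delta^T N))\right] > 0.$$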

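Define

$$K^* = \max\{k^* \ge 0 : G(k) \ge 0 \text{ for all } 0 < k \le k^*\}. \qquad (3.11)$$

THEOREM 3: Under (3.9), for any $1 \le j \le g$ and $1 \le k \le K^*$, if
$N = n \oplus k - 1_j \in \{N : n_i \le N_i \le M_i\}$, then

$$\ell^*(n \oplus k) > \ell^*(n \oplus k - 1_j) \qquad (3.12)$$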

Proof: This is obviously true when k = 1. Assume (3.12) is right for integers 1, 2, ..., k. For integer k+1,
we have

$$\ell^*(n \oplus (k+1) - 1_j) - \ell^*(n \oplus k)$$

$$= [\ell^*(n \oplus (k+1) - 1_j) - \ell^*(n \oplus k - 1_j)] + [\ell^*(n \oplus k - 1_j) - \ell^*(n \oplus k)]$$

$$< [\ell^*(n \oplus (k+1) - 1_j) - \ell^*(n \oplus k - 1_j)]$$

By Theorem 2,

$$\ell^*(n \oplus (k+1) - 1_j) - \ell^*(n \oplus k - 1_j) < \ell^*(n \oplus (k+1)) - \ell^*(n \oplus k)$$

Therefore

$$\ell^*(n \oplus (k+1)) > \ell^*(n \oplus (k+1) - 1_j)$$

Combining Theorems 1-3, we obtain the ML estimator of N as:

$$\hat{N} = n \oplus K^* \qquad (3.13)$$
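The stepwise search leading to (3.13) can be sketched as a greedy loop: starting from the observed vector n, repeatedly add one gene to the size class with the largest increase in $\ell^*$, and stop at the first step whose gain is not positive. This is a minimal sketch under stated assumptions, not the implementation used in the disclosure: the function names and toy data are hypothetical, ties are broken by lowest index, and the strict form G(k) > 0 of the stopping rule is used (the text states it with both > and ≥).

```python
import math

def _xlogx(x):
    # x ln x with the convention 0 ln 0 = 0
    return x * math.log(x) if x > 0 else 0.0

def profile_loglik(N, M, n, S, delta):
    """ell*(N) of (3.3), up to an additive constant.

    Assumes sum(n) > 0 so that delta^T N is positive.
    """
    N1, M1, S1 = sum(N), sum(M), sum(S)
    dTN = sum(d * Ni for d, Ni in zip(delta, N))
    value = _xlogx(N1) + _xlogx(M1 - N1) - S1 * math.log(dTN)
    for Mi, Ni, ni in zip(M, N, n):
        value -= math.lgamma(Mi - Ni + 1) + math.lgamma(Ni - ni + 1)
    return value

def greedy_ml(M, n, S, delta):
    """Return n (+) K*: greedy stepwise estimate of N."""
    N = list(n)
    current = profile_loglik(N, M, n, S, delta)
    while True:
        best_gain, best_j = None, None
        for j in range(len(N)):
            if N[j] >= M[j]:
                continue  # class j cannot hold more nonessential genes
            N[j] += 1
            gain = profile_loglik(N, M, n, S, delta) - current
            N[j] -= 1
            if best_gain is None or gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is None or best_gain <= 0:
            return N  # first step with G(k) <= 0: stop at K* = k - 1
        N[best_j] += 1
        current += best_gain

if __name__ == "__main__":
    # Toy data (hypothetical): two gene-size classes.
    print(greedy_ml(M=[50, 40], n=[20, 30], S=[25, 70], delta=[300, 900]))
```

Each step costs one evaluation of $\ell^*$ per size class; for hundreds of classes the increment formula (3.6) could replace the full re-evaluation, since only one $N_i$ changes at a time.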

Example 2: 
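In the table that follows, the PROBABILITY and LOWER values appear to depend only on SIZE: rows with the same size carry the same values. The source does not state how the columns were computed. The sketch below checks one reading that fits the tabulated numbers, namely 1 - exp(-rate x SIZE) with one rate per column; this reading, the function names, and its interpretation as the chance that a nonessential gene of that size receives at least one insertion under the Poisson model of Example 1 are assumptions, not statements from the disclosure.

```python
import math

# (SIZE, PROBABILITY, LOWER) rows copied from the table
ROWS = [
    (89, 0.2395316, 0.2281006),
    (143, 0.3559372, 0.3403112),
    (1454, 0.9885919, 0.985443),
    (2528, 0.999581, 0.9993599),
]

def rate_from_row(size, prob):
    """Invert prob = 1 - exp(-rate * size) for the rate."""
    return -math.log(1.0 - prob) / size

def predicted(size, rate):
    """Probability of at least one event for a Poisson mean of rate * size."""
    return 1.0 - math.exp(-rate * size)

if __name__ == "__main__":
    size0, prob0, lower0 = ROWS[0]
    rate_prob = rate_from_row(size0, prob0)    # rate implied by PROBABILITY
    rate_lower = rate_from_row(size0, lower0)  # rate implied by LOWER
    for size, prob, lower in ROWS[1:]:
        print(size, prob, predicted(size, rate_prob),
              lower, predicted(size, rate_lower))
```

If the reading holds, a value that is illegible in the table can be cross-checked against a legible row of the same size.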
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1 SANUMBEBt J- GENE • NAME.'i^i-^^.; 


SIZE- SI 


PROBABILIiaf^'USs^i L0WE8§!^f • | 


SA0001     dnaA        1358   0.984672    0.9807534
SA0002     dnaN        1130   0.9690877   0.9626406
SA0003                  242   0.5250502   0.5053862
SA0005     gyrB        1931   0.9973706   0.9963655
SA0016     dnaB        1397   0.9864051   0.9828177
SA0017                   89   0.2395316   0.2281006


SA0019 

:SA0026 


yycL 


698 

V253 " 


0.8832234 
' 0.978827 


0.8687278 
'0.973878 


SA0027 




236 


0.5162013 


0.4966774 


•SA0028 _ 




671 


0.8731086 


0.858001 5 


SA0029 
•SAOOSq 
SAOOfr ^ 
SA0b32 " 


. 

.maoC 


164 

227 
740 


_0.3962339 

' 0.6626178 

0.8973789 

0.7295237 
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i S a6664 ' 632 0.8569319 0.8409423- 

isAoeee • - • • 947 - 0.9457187 0.9363797' 

fSAOeeT 716. 0.8895146 0.8754246 

I3A06 ' 76 ■ 713: 0.8884901 0.8743327. 

^SM^ r^ 785. 0.910647 0,8980798 ; 

rSA06 72 isarA " 371 0.6806387 0.6601459^ 

I3A0673 116 0.3001504 0.2864081 : 

jsA6'674 'Z^ZZIZIIZL 1 146 . 0.3618545. 0.3460432 

1 5^067 6" " 1^ " 221 ' "0.4933 51' 0.474228 7: 

'rS A0680 422 0.7270157 0.7070045; 

iSA0683 ~~ZllZlllZr- 0.4426495 0.4246116 

rSA0685 " 299 0.60144 69. 0.5809609 

^AOe se IL 434 0.7369104 0.7170559 

j3A0 688 926^ 0.9420958: 0.9323721. 

Is Aoesg 633 0.9229144 0.9113622 

iSA0690 , 740; 0.8973789 0.8838253 ' 

iSA0694" * 791 0.9122813' 0.8998433. 






WO 2004/018624



Table I 
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fSANUMBERTT^ 

•SA0695 830 _ _ 0.922 1996 0.9105853 

' SA0696'* ' * "~^ag'B_^ 1 100 " ' 0.96 60987 7. 0.95923 37 

.SA0698 tagb '"395* ' ' 0.7033706 0.6830 639 

,SA0702 " ' "89 ' bJ2395 316 0.228100 6 
:SA07b3 _ ' _??P " . . 9J^22199^^_ 

SA07b7 ' 965 I '3.948643 " " " 0.93 96253 

SA0708 ' " ' '581 'a8326263 " 0.8155043 

SA0709 359 0.6686277 _ 0.648072 8 
SA0710 494 ^"q.78125^^^^ I"_._9Z§?5713 
!SA0711 " 1064 *" " a962V279" ^' " 0.954733 

.SA0713 ■ - - - • '^25 " '"5*7295237 * * 0.7095503 

SA0714 503 0.787231 P-7685119 

■SA0716 Jsd _ 0.876574 " 6.861670 9 

SA0717 ' J ljP37 6.9588476 0.9510 342 

'SA0721 -- ---- ■ 6.84878M ' ' 0^2391 

;SA0722 - - - - 1004 0.9544499 "0_.946lji)06 

. SA0728 470 _ _ 0.7644 94 T i^-Tf? 1882 

1SA0729 ^ lil6 7. 913001504 P^§^^.k^ 

'SA0730 " " ~ 6!860839i 0.84 5052 5 

SA0731 863 0.9297108 .. 0-9187697 

.SA0732 ""i *"22l'_" . P -4^351^^^^^ 

:SA0736 ' Z " 6.74 1 7224 ' " " 6.72"l 951 6 
♦SA0737 " 392 6.70062 0.6802859 

;SA0738 " 2 ' "2 96' 0.5977512" 0.5772879 

I SA0739 'I "' 539 6.8695389 "' "0.791528 1 

^A0741 [ ] 455 0.7533709 0.7338233 

i'Sa6742 680 0.876574 0 . 8616709 ; 

i"SA6746 • 440 " 0.7 41 7224 0.7219516 

!Sa6755 ' * 461 0. 7578819 0.7384289 . 

rSA677l Z. " 11§ 0.7 219296 0.7018457. 

"SAOm' ^ 1937 0.9974187. 0.9964284 

jSA0783 opuBB 1511' 0.9904269^ 0.987667 3' 

-^gA0784 ^ h is e " 10 56 — " 0.9 6^43646 Q^53532Z - 

1S A0786 '_ ^ 158 0.384985 0.3684777 ; 

^SAOTST " 914 0.93991 81 0.9299696 

iSAO TiT inrdi 395 0.7033706 0.68306391 

ISA Q793 nrdF . . „ 968. . 0.9491149 0.9401499 ' 

ISA0795 152; 0.3735265 0.3573583' 

iSA0797 _ 953 0.9467115 0.9374805 

!Sa6'799' " _ 1025: 0.9572999 0.9492947- 

ISAOeOI ' .mure 920 0.941017 0.9311813^ 

iSA6804 ' 317 0.622 9185 0.602338 ' 

iSAOSOS "IL 1121 0.9682 197 0.9616495 

ISAOSli" • * ' 671 0.8731086 0.8580015 . 

iSA68l"6" ^secA 2528 0.999581 0.9993599: 

ISA0819 ^ 104 0.2738294 0.2610581 * 

iSAOSp ZIZZZZIZIIZZ~-1 ' — - 233 0.511 7 152 0.4922657 ' 

i SAOSie" jlgt Z'_ 836 0.9236226 0.9121324 

iSA0829'"" [ 9^: 0.9431549 0.9335422- 

^A0831 Z_ 0.9527368 0.9441859 

!S7^0832 941. 0 .9447074 0.9352595. 

ISA6836 " ^ 143 0.3559372 0.3403112 
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SANjJMBER • I . GENE^NAME .lrrV 



SA0838 
^A0840 

!SA0843 
SA0844_ 
'Sa6847 
.SA0848 
'SA0849 
SA0850 
'SA0851 
S'a085? 
.SA0853 
!SA0865 
rSA0859 
'SA0861 
:SAd862 
.SA0863 
5SA0864 

•SAOsee 

•SA0867 
'SA0868 
;SA0869 
rSA0873 
iSA0874 
:SA0875^ 
'Sa6876~ 
i"SA08782 
rSA0880 
' iSA088ji 
•^0884^^ 

isAoess" 



!SA0886 



ISA0887 



ISA0888 



iSA0889 

:SA0890 

!SA089i 

iSA0892_ 

!SA0893^ 

ISA0894 " 

iSA0895 

[SA0896^ 

tSA6897* 

[SA0898' 

ISA0899 



.gap 

IpiA 

eno 

secG_ 
smpB 



•cspC 



aroD 



1007 
_758 " 
1301 ' 
___455 
272 7 
461 ' 
278"* 
320 ' 
143 
725 
"89* 
152 
527 
"467" 
197 
215 
1 28J, 
J566_ 

ibr 

143 
233 

587 

"713" 



536 



_ 0.9548684 
' 0l9029075 
6.9817338' 
0.75 33709 

" 0.5669254 

0.7578819 ' 
0.5748465 ' 
0.6263829 
L 0.3559372 
3 'a892532 ' 
' 6.2395316 
0.3735265 
0.8023757 ' 
^'0.7623104 
6!4545246' 
0.4839115 
J 6.5787526 _ 

2 ^""^0^.824721 

0;2676958 " 
0.3559372 
0.5117152 
... "o;8356876 

" 6.8884901 ' 

6.8077728" 



tQWER^.| 
0.946 5689 

" 0-8897519 

"0.9772823 
0.7338233 
6.5467212 
677384289 
"0.5545641 
0.6057J33 
0.^03112 

' 0.8786438 

""6T2281606 
0.3573583 
0.7841222 

'""6>429"547 
0.4362098 
0^649714 
0.5584345 

■ _6.8072756 
6;2545811 
^ 0.3403112 
0]^49226^ 

1. 6.8186965 
0.87433 27 
0.7897008 



317 



353 



^.6^9185 
0.6624538' 



0.602338 



0.6418764. 



110 



0.2871114 



0.2738437: 



383 



0.6922144 



0.6718049 



293 



0.5940213 



0.5736828- 



818 



0.9192736 



0.9074089 



1217 



0.976347 



.ent 

;sei 



725 



0.892532 



0.970994; 
0.8786438 



"725- 



-0:892532 — 0T87-86488r 



428 



0.7320087 



0.71207411 



455 



0.7533709 



329 



0.6365864 



0.7338233' 
0.6159801, 



260. 



0.5506376 



269 



0.5629096 



0.5306188; 
0.54274825 



143 



0.3559372 



0.3403112 



206 



0.4694214 



317 



0.6229185 



0.4507788 . 
0,602338 



779 



0.9089822 



0.8962853 



1454 



0.9885919 



359 



0.6686277 



281 



0.5787526 



0.985443 
0.6480728 
0.5584345 



1SA0900 



638. 



0.8595487 



338 



0,6465111 



0.8436944 
0.6259037 



SA0902                  575   0.8295079   0.8122558
SA0903                  215   0.4839115   0.4649714
SA0904                  524   0.8005432   0.78223
SA0905                  338   0.6465111   0.6259037
SA0906                  566   0.824721    0.8072756
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1;. SANUMBEF^' J: GENE . NAME- Iv^ 






iSA0907^ 


seb _ _ 


797 


0.9138857 




''SA0908_* **" , 




554 


0.8181289 




^SA0969 




149 


0.3677175 




,SA0910 




107 


0.280501 


U.<^o/4 / 00 


•SAd9lV_ ^ 




197 


0.4545246 




SAMI? 




191 


0.4443617 


0.42o2oo 


"SA0914 




758 


0.9029075 


0.8897519 


rSA0916 




1238* 


' 6"9778269 


0.9727129 


:SA0918 




1394 


0.9862791 


0.9826671 


:SA09i9 




_J34 J 


b!3378541 


0.32281 18 


rSA0920 




""311 


0.615893 


U.oyooob«3 


SA0922 




1064 


" 0*9621279 


0.954733 


!SA0923 




95 


0.2534409 


0.2414564 


'SAb92V 




824 


0.9207501 


0.9090109 


:Sa6928" " 




389 


0.697844 


0.6774835 


.SA0929 




254 " ' 


"0.5422654 


0_^5223543 


:SA0933 




107 


'6.280501 _ 


0.2674788 


rSAb934 




""206' 


0.4595361 


0.4411085 


i'SA6935 " " ' ' 


''"ditA 


1454 


0.9885919 


/*\ A A^ 

0.985443 


:SA0936 


dltB 


1211 


679759064 


0.9704833 


•SA0937]^ 


dItC *" 


* 233 * 


O.51Y7152 


6.4922657 


;SAd938 ^ 


-dllD 


1172 


0.9728348 


0.9669372 


:SA0939 


-nilU-3 


239 


0.5206462 


0.5010508 


ISA0940 




320 


0.6263829 


0.6057933 


!'Sa6942 " 




233 


0.5117152 




ISA0943 




356 


0.665555 


0.6449881 


ISA0947 


jyuxO 


371 


0.6806387 


0.6601459 


ISA0949 


mnhG 


353 


0.6624538 


0.6418764! 


tSA0950 


:mnhF 


290. 


0.5902568 


0.5698451; 


ISA0952 


mnhD 


1493 


0.9898818 


0.9870043 


!SA0953 


mnhC 


338 


0.6465111 


0.6259037 


ISA0954 


imnhB 


425 


0.7295237' 


0.7095503' 


1SA0955 


imnhA 


24U2 


0:9993827— 


0:9990766-- 


ISA0961 


igluD 


1241 


0.9780306 


0.97295' 


}SA0965 




13r 


0.3317143 


0.3168761 


ISA0969 




581. 


0.8326263: 


0.81 55043! 


|SA0972 




104 


. . . 0.2738294 


.0.261 0581 1. 


ISA0976 




821 


0.9200152 


0.9082134 


ISA0977 




263. 


0.5547661 


0.5346972j 


ISA0982 




638 


0.8595487 


0.8436944' 


ISA0983 




1115 


0.9676276 


0.9609743. 


!SA0984 




512 


0.7930417 


0.7744939^ 


iSA0985 




431 


0.7344708 


0.7145759 


ISA0986 




182 


0.4287612 


0.41 10641. 


ISA0987 


.fabH 


938 


0.9441946 




iSA0988 


:fabF 


1241 


0.9780306 


0.97295! 


ISA0989 




368 


0.6776774 


0.6571671" 


tSA0990 




95 


0.2534409 


0.2414564 


iSA0996 




1712 


0.994842 


0.9931274 


ISA0997 




983. 


0.9514098 


0.9427053- 


!SA0998 




977' 


0.9505045 


0.9416965* 


iSA0999 




959 


0.9476862 


0.93856231 
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I, SANUJy^BER,-! GENE^NAME, 

SA1000 oppC 

*'SAl601 _ " lrp_S / " ] 

•"SA10b2"* 

rSA1004 1 ^ . . . . 

SA"i009 ' ' " _ J_ 

SAIO16 '6^^.* . 

tSA1011 
rSA1012' 

"SA1616 iabi '['1] 

*SA1019 .1..., . ! 

SAIO23' 
.SA1024 

:SAld27 ^ 

SA1028 " \ " 

'SAIO29' [ 

SA1030 

rSA1031 

SAi632 .'r_. . 

»*SA1033 
•SA1034 

:'SA1035 _ 

:SAld37 " "* ^^'^ 

i SA1 039 . . ~ . '""^ _ 
;SA104 1 " _ _ 

;SA1042 



SIZE. : ^.|-, PRQBABlLn^iJfk 
0.9328809 



878 
"986' 
'392 
983 
'344 * 
' 632 
806" 
851 
767 
755 ' 
1476 
257 
92 
1439 
857 
1355 
1508 
527 ' 
215 
983 
173 
' 104 
'317 ~ 
287 



0.9518563 



0.70062 

0,9514098 
'0.6529766^ 
o!8'5693j^9" 
br9T62375 
0!9i270673 
'b.9055591 
"b.9020072 
0.9894039 
0!5464708 
"b.2465183 
'0.9880531 
b!9'284013 ' 
0.9845299 
6.9903381 
1)78023757"" 
* b'4839115 
0^951 4098 
jo!4 127227 
^2738294 ' 
0.6229185' 
0.5864574 



feLoyvEg^ 

0.922 238 
'2 0^432031 
0.6802859 
0.9427053 
0.6323765 
2 0.8409423 
0"964Vl96 
0.915884 
0.8926008 
b;8887856 
0.9864247 
0.5265046 
0.2348076 
'b.984'7938 
0.9173395 
0.9805847 
0.9875592 
0y84T222 
0.4649714 
0.9j1.27b53 
0^39544 16 
0.2610581 
076b23"38 
" 6!5660748 



iSA1045 



ISA1046 
1^047 

isAiosb 



tSA1051_ 
:SA1054 



iSA1055 



284 


0.5826228 


0.5622713 


956 


0.9472011 


0.9380237 


128 


0.3255175 


0.3108884 


287 


0.5864574* 


0.5660748! 


101 


0.2670958 


0.254581 r 


1208 


0.9756829 


0.9702246. 


818 


0.9192736 


0.9074089, 



326 



0.6332166 



0.6126141; 



TSA-1t)60- 



"4-16- 



-077219296 



-^r7-04^8457^ 



ISA1061 



107 



0.280501. 



0.26747881 



iSA106 5 
ISA1067 
ISA1072 



1214 



0.9761277: 



0.9707398 



275 



0.5709042 



0.5506598. 



folD 



857 



0.9284013:- 



0.9173395 



1SA1073 

ISAIO74"" 

iSA1075" 



purE 



416 



0.7219296 



0.7018457' 



purK 



1121 



0.9682197 



0.9616495 



:purC 



701 



0.8842963 



0.8698684 



!SA1077 
tSA108 1 
iSA1084 ' 

iSAiose' 

iSAIOOO 



■ purQ 



668 



0.871932 



0.8567569 



purN 



563 



0.8230957 



803 
"572 



0.9154608 



0.827927 



539 



0.8095389 



0.8055863 : 
0.9032792 
0.81 061 02 - 
0.7915281 



1SA1091 



263 



0.5547661 



0.5346972! 



{SA1093 



230 



0.5071875 



0.4878153 



jSA1097 
1SA1098' 



176 



0.4181183 



0.4006946 



1694 



0.9945483 



0.9927579 



ISA1099 



215 



0.4839115 



0.4649714* 



ISA1100 



<def 



548 



0.8147404 



0.7969153^ 



ISA1101 



623 



0.852915 



0.8367231 
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SANUMBERvl GENE_NAME..J:. . SIZE 




'SA1102 
iSAHOS 

:SAVld4* 
:SA1107 

•SAliOB 
"SAniOG 
rS A1110 * 
'SA1117 
"SAlil9 
:SAil20 
:'SA1 122 
.SA1126 
SAl"l27 
SA1129 
= SA1131 
•SA1132 
SA1133 
iSA1 134 
SAVi37 
.SA1140 
•SA1141' 
iSA1144 
'SA1145\ 
:SAJ146"' 
igA1 147 

!SA"'i1 48 

1SA1149 _ ...pheL__ 
3SA1150 " Tnhb _ 



kdtB 
rpmF 



pheS 



1109 
974 
1289 
536 
1091 
794 
806 
188 
164 
479 
122c 
45£ 
200 
431 
251 
386 
539 
479 
170 
205 
680 
962 
731 
320 
737 
^055 
'J399' 
935 



PROBABILITV^aJ.-**^ 
CK9670245 

'* "6.9500456' 



0.981^0468 

08077728 

0.9651468' 

0.9130872 

0.9162375" 

_0.4392094 
0.3962339' 

^6T77'09258' 
0T9767797 

"6.7556368 
■0.45'95361 

"6.7344708 
bi38b2"'l 
6.69'50422" 

_0. 8095389 
0.7709258^ 

"o" 4072771 
6.4677865 
_ b;876574 

^ 0.94*81668 ' 



0.89Ji4976 
■"0626382^ 
"0.8964273 



0.9610646 



0.9993769 



0.9436772 
0.5547661 



LOyyER.^;-.'. 1 
0.9602872 
JK9411854 

" 0.9764752 
0.7897008 
6_.9581523 
6.9007135 
0.9641196 
0.4212542 
6.3794046 
0.7517728 
0.9714959 
0.7361361 
0.4411085 
6.7145759 
0.5181676 
0.6746566 
0,7915281 
0.7517728 
0.3901425 
0.4491788 
0.8616709 
b.'9396961 
0'8867435 
676057933 

1 _0i82807 
"'0.9535322 
0.999068_5 
6.9341l'96' 
" *b.5346972' 



1SA1156 




92 


0.2465183 


0.2348076 


»SA1161 


-murl 


797 


0.9138857- 


0.9015762 


ISA1163 




500 


0.7852581 


0.7664829 








326 


0.6332166 


0.6126141 


I5;A1165 




104 


0.2738294 


0.2610581. 


1 


SA1166 
SA1168 




398 


0.7060958. 


0.6858177 


1 


fib 


494 


0.7812672 


0.7623713 


1 


SA1169 


fib 


347 - 


- 0.6561649 


0.6355708 


i<;aii70 




242 


0.5250502 


0.5053862 


! 


.c;aii71 




182 


0.4287612 


0.4110641. 


ISA1173 




956 


0.9472011 


0.9380237 


ISA1174 




143 


0.3559372 
0.4494666 


0.3403112 
0.431268 


ISA1175 




194 




SA1176 




245 


0.5294138 


a5096839 




SA1177 




128 


0.3255175 


' 6.3T08884- 




SA1179 




722 


0.8915355 


0.8775801 




SA1181 


arcB 


998 


0.9536013 


0.9451516. 




|SA1185 




185 


0.4340094 


0.4161814 




ISA1186 




131 


0,3317143^ 


0.3168761 




iSA1187 




131 


0.3317143 


0.3168761. 




ISA1188 




692 


0.8810477 


0,8664164 




|SA1189 




437 


0.7393275 


0.7195144. 




ISA1193 




398 


0.7060958 


0.6858177* 
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1 


SANUMBER J. GENElNAME J::^: 


: SIZE; 




PROBABILITY. - - I^.U:.- 


.LpvyEB.r..fi:;| 




Pbpl 


. 


2231 




0.9984814 


SA1 195 


V/ 

mrar 


— • 


96? 


0.9481668 


0.9390961 


SA1196 


murD ^ 


— 


1346 


6^9840955 


0.9800697 




divlB 


. 


1316 


n Qfl9S577 


0.9782522 


SA1198 


fisA 






n QRRfiQ7Q 


0 9834071 


■SA1199 


ttsZ 




1 169 


... p'^'^^Q^q 


0 9666474 


^A1200 _ 










0 8542348 








0/ 1 




0 8580015 


SA1202 




yImF 


- 

— - 




n R9 14*^*^4 




:SA1264 


yImH _ _ 


... 






0 9032799 


'SA1205 






RliS 




0 8323918 

\/.ww&.ww iw 


'SA1206 


ileS 






u.yyy f 00** 


n QQQRR4^ 


'SA1209_ 






Qi A 


u.yoya 1 0 1 


n Q9QQRQR 


SA1212 


«. « .« . — . ..^ 

_ PyrB ^ 


. 


it7f- 
o/c 


u.yozooL/y 


n Q999'^R 


SAT2I8 









n ^7/19QKl 

u.** f '*zyo 1 




;SA1219 




- 


000 


n TORHQ'^R 


w.OO«JO lit 


■SA1222 




rpqZ 






n /IR'^Q1 1 *^ 

u.^ooy 1 1 0 


n ^RdQ71^ 


:SaV225 




. — . . . 


you 


0 Q/lROIV/l 

u.y^DZ 1 f *+ 


n Q'^RQ'^9'=i 


•SA1226 






£.10 


u.o f uyu*»z 


n '^'^fiR'iQft 


SA1234 


. — .. . — . 




ft*?*? 


u.yoioou** 


n Q9nRRflR 

u.y^uoDoo 


'SA1235 _ 


• . ..... 

. ...rpe 




a.A "I 
o*n 


u.oDuooy 1 


n ft^sn<^9i^ 


rSAV236"'" 




. 


044 


U.OOZI I f D 


n fi^R^QAO 

u.o^Doyoy 


rSA1237 


. . . ^« . . . . 










!SA1238 ' 


^. — 

rpmB 


— 


lot5 




U.*T 10 10 I** 


:SA1240 







1D4o 


u.yyoozzi 


n QQl (^QOT 


ISA1242 









0.8263314 


0,8089502 


'SA1243 


:plsX 




yoo 


u.yDi^uyo 


u.y*iz / uoo 


iSA1244 


ifabD 







u.y4oi o*iy. 


n Qk'X'x^Aory 
u.yooo*fZ<c' 


:SA1245 


:tabG 


- 




u.oy*i4y r D 


n RRn7/i*^^' 
u.oou/ ^00. 


fSA1246 






yo 






ISA1247 


.acpP 






U.OUf 1 Of 0 


n ilfl7RlMi 


1SA1248 


• rnc 




"700 


u.oy«^u lyo 


n A7QKQA9* 

U.O / yoyoz 


-4SA^^1 


4tey 





— 




n Q7'^^1R1- 


ISA1252 






o^^y 


U.OOOOOD4 


u.D 1 oy ou 1 


1 


SA1253 


;ffh 




1od4 


u.yoyozo' 


u.yo luoo*^ 


1 


SA1254 


• rpsP 




£.1 £. 




n '^4R7919 


j 


SA1255 


IrJmM 


• - ■ • 


. puu 




n 7RRilA9Q 


ISA1256 


•trmD 




/ OH 




n RR177Qfl' 


ISA1257 


irpIS 




O** / 


n Rf^RIRilQ 




iSA1260 






00 1 




0 Q99Ql'^fi' 


ISA1261 




rnhB 









0 RQifiSQS 

V . w v7 1 wwww 


ISA1263 


•sucD 






n Q'^R2319 


0 99811 19' 

w.w^U 1 1 Iw 


ISA1264 


jlylN 




i i4n 

1 1 **o 


0 Q70753- 


0 9645465 


iSA1269 


!xerL» 




"ciciV 




V/. wiC v^w wU^i 




SA1270 


ihslV 




542 


0.8112888 


0.7933395' 




SA1273 






89 


0.2395316 


0.2281006. 




SA1274 


irpsB 




773 


0.9072864' 


0.8944591s 


ISA1275 






110 


0.2871114 


0.2738437- 


ISA1276 


asi 




878 


0.9328809 


0.922238 




ISA1277 


;pyrH 




719 


0.8905297 


0,876507 




1SA1278 






551 


0.8164424 


0.7986799 




ISA1279 


uppS 




767 


0.9065591! 


0.8926008: 






PCT/US2003/025879



I:- SANUMBER-:--!- GENE_NAME 

■ SA1280 cdsA 

.SA128 2 ' Pr?S r. 

.SA1286 _ " "V ' 1 

SAi287 " * 

L^A1?.91^. - 

' SA1 292 = fP?0 

SA1295 

= SA1296 „ 

'SA1297 2,. .. .. 

SaT299 " J 

sa'i36o 

SA1301 

'^A13q2_ .^.J.PQSA . 

1SA1364_' '[ _ 

SM30b 

SA13b5 

SA1308 _ _ 

rSA13ioJ*^J 

'SA1311*" *" " / ' , ] 1. 

SA1313 

rsA1315 hexA 

Ts A13J 7 ' gIpP " " 7. 

fsAI 31 8 ' * 7." . . 

iSA1322" ______ 



SIZE. 



ISA1323 



:miaA 



'SA1324 



77S 
1700 
'281 
314 
968 
' 266 
2375: 
710 
1262 
701 
824 
38£- 
57t 
1040 
1556 
215 
1757 
290 
134 
362 
251& 
' 535^ 
'l58 
911 

" ^932 
" 230 



PRpBABIUTYcfvJfe 
^ " 0.9089822 
0.994648 " 



0.5787526 
0.619422 
"0.9491149 



0.5588567 



0.9993292 
0.8874562 
0.9794052 



0 .8842 963 
d.920750T ' 

0.697844 
0.8295 079 

^0.9592257' 

76;99V6646' 
0.4791 26 
0^9955089 
0.5902568' 

' 0.337854 1^ 
0.6716721 
0.9995693 
0.8095389 

* 6.384985 
'6;93936T 



0.9431549 



0.5071875 



0.8962853 
0.9928832 
0.5584345 
CL5988525 
0.^01499 
0.5387403 
0.9990011 
0.8732312 
6.974553 
0.8698684 
0.9090109 
0.6774835 
0.8122558 
0.9514597 
0.9891805 
0.4602817 
" 0.9939706 
"0.5698451 
0.3228118 
0.6511307 
0.999343 
0.7915281 
' 0.3684777 
JO.9293558 
" 0>93 3'542 2 
'^""b.4878153' 



ISA1325 



.gpxA 



!SA1327 



1SA1328 



473 
J235 
365 



0. 7666578. 

0.9776213." 

0.6746886 



0.7474023 
"0*.97"24737; 
6.6541 621* 



jSA1330 



!SA1331 



1SA1332 



VP 
194^ 

"i221 * 



0.2871114 



0.4494666 



0.493351' 



-tSA1333 



-205- 



-G:4646G46^ 



0.27 38437 
_ 0.431268 

0^742287' 
^4459646- 



SA1334                  107   0.280501    0.2674788
SA1335                  182   0.4287612   0.4110641
SA1336                  101   0.2670958   0.2545811
SA1337                   92   0.2465183   0.2348076
SA1338                  248   0.5337373   0.5139443
SA1339                  581   0.8326263   0.8155043
SA1340                  245   0.5294138   0.5096839
SA1341                   98   0.2602999   0.2480474
SA1342                  263   0.5547661   0.5346972
SA1343                  134   0.3378541   0.3228118
SA1344                  131   0.3317143   0.3168761
SA1345                  287   0.5864574   0.5660748
SA1346                  191   0.4443617   0.4262831
SA1347                  338   0.6465111   0.6259037
SA1348                  188   0.4392094   0.4212542
SA1349                 1022   0.9569039   0.9488503
SA1350                  194   0.4494666   0.431268
SA1353                  728   0.8935193   0.8796982
SA1354                 1088   0.9648236   0.9577855






:SA1355 


SIZEr^::: . L 
599 


PROBABILI3pfe::|fe'f 
0.8416434 


r LOYyE% .| 
0.8249163 


'SA1358 




851 


0.9270673 


0.915884 







104 


0.2738294 


0.2610581 


• SA1366 


- ■ • ■ 


J]^" "311 


0.615893 


*0.6953363 


iSA1367 


. — 


1451 


0.9884861 


0.9853154 


»5?A1368 




1416 


0.9872557 


0.9838359 


O 1 >J\J\J 


rpmG 


155 


0.3792822 


0.3629423 




•rpsN 


266 


0.5588567 


0.5387403 


!SA1372 




^101 


0.2670958 


0.2645811 


:SA1374 




620 


0.8515512 


0.8352918 


~Sa"i375 


- 


230 


0.5071875 


~ 0.4878153 


:SA1378 




284 


0.5826228 


0.5622713 


:SA1379 


- - 


95 


0.2534409 


0.2414564 


SA1380 


-~ - 


454 


0.7601063 


" 6.7467617 


:SA1383 


mscL 


434 


0.7369104 


6.7170559 


iSA1386" 




464 


6,7601063' " 


" 0~7407017 


SA1388' 


- • 


505 


1X8445398 


0 8279457 




SA1389 




, parE 


1994 


0 9978339 


0 9969741 


'SA1390 


.parC 


2399 


0 9993769 


0^990685 




SA1391 




95 


0 2534409 


0 24 1 4 564 


''S A 1393 


- • ■ 


848 


6'92639i~ 


0 9151468 




SA1394 







Vj * w C ^ O %J k7 


0 37Q4046 


!SA1397 


— - - - • • • 

• msrA 


506 


0 7891858 


0 77052*^4 


ISA1399 


'dmpl 


182 


0 4287612 


0 41 10641 


:SA1400 




1259 


0 9792142 






SA1404 


itrpG 


563 


0.8230957. 


0.8055863 


1 


SA1406 


rirp^ 


779 


0 9089822- 


0 8962853 




SA1407 


jirpr 


629 


0 8556053' 


S 8395481 


iSA1409 


i IrpA 


683 


0 8777079! 


0 8628729 


:SA1410 


ffemA 


1259 


0 9792142- 

w*wf 1 


0 97433 


!SA1411 




1256 


0 9790215' 


0 974105 


1 


SA1412 


^ 


764 


0 9046833' 


0 8916*^<55 




-SA144^ 


» 


470 


0 7644Q41 


n 74*^1 RRo 


i 


SA1414 




698 


0 8832234 


0 8687278 




SA1418 




341 


0.6497588; 


0 6291542 




SA1421 


. 


848 


0 926391 ' 


0 9151468 




SA1422 




914 


0.93991811 


0 9299696 




SA1425 




293 


0.5940213; 


0 5735828 




SA1426 




899 


0.9370803' 


0.9268462 




SA1427 





1598 


0 992675 


0 9904248 




ISA1428 





1202 


0.9752299 


0.9697003 




5SA1431 


1 UOfJO 


719 


0.8906297 


0.876507 




ISA1433 




1148 


0.970753' 


0.9645465 




ISA1435 


lysA 


1262 


0.9794052 


0.974553 




1SA1436 




398 


0.7060958 


0.6858177 




ISA1437 


cspD 


197 


0.4545246; 


0.4362098 




ISA1438 




305 


0.6087366: 


0.5882113 




ISA1440 




626 


0.8542664- 


0.8381417 




ISA1441 




1133 


0.96937171 


0.9629652 




jSA1442 




389 


0.697844; 


0.6774835 




ISA1446 




200 


0.4595361: 


0.4411085 




ISA1450 


arlS 


1352 


0.9843864: 


0.9804145 






-SA1451 


anK 


bob 


U.ob/ 1 1o4 


U.0OI ODoo 


'SA1452 




b 1 1 


O.o47oo32 




•SA1453 


— ' 

murG 


lOo/ 


0.9624759 


0.9oo12d3 


SA1454 




293 


0.5940213 


U.57o5o2o 


SA1456 


. — 

. . . — . 


218 


0.488653 


0.4o9d202 


-SA1458 




425 


0.7295237 


0.7095503 


:SA1460 


.... 

degV 


836 


0.9236226 


0.9121324 


SA1461 


folA 


476 


0.7688016 


0.7495971 


SA1462 


;ihyA 


953 


0.9467115 


0.9374805 


SA14(53 




92 


6.2465183 


0.2348076 


SA1464 


- .... 


434 


0.7369104 


6.7170559 


.SA1466 




246 


0.5337373 


0.5139443 


'SA1467 


. . 


216 


0.488653 


0.4696202 


SA1468 





701 


0.8842963 


0.8698684 


■SAI473 


... .. . . , 


179 


0.42346431 ^ 


~J) .405902 


;SA1474 "* 





95 


0.2534409 


0.2414564 


:SaV477" * 7^ 


ilvA 


1037 


0.9588476 


0.9510342 


SA148 l'^ " " 




_ J 1337 


0.983649 


_ 0.979541 


•SA'i^si 


..— . . ..... . 


' "'7 329 


676365864 


_ 6.6159801 


'SA1484* 


. . - 

•divlVA 


341 


0.6497588 


0.6291542 


ISA1485* 






0.8214554 


0.8038822 


SA1486 


., 

_« . - . ..- , .. . .,1.-, 


"..„, 347 


0.6561 64^ J 


"^6355708 


{SA1487 




"...Z.J?'' 


' 0.3906354 


' 07373965 


.SA1488 





122 


0.31 2951 « 


0.298755 


:SA1489 


jrecU 


623 


0.852915 


0.836723 


ISA1490 


ipbp2 


2180 


0.9987778 


0.9982386 


SSA1492 


inth 


656 


0.8671154 


0.8516683 


;SA1493 




683 


0.8777079. 


0.8628729 


1SA1496 


, , . . , . „ , 


968 


0.9491149. 


0.9401499 


ISA1497 


. 


1097 


0.9657843 


0.9588764 


«SA1498 




1139 


0.9699319. 


0.963606 


jSA1499 




314 


0.619422' 


0.5988525 


'i>Ai500 




fOA 


U.B853593; 




iSA1502 




572 


0.827927> 


0.8106102 


!SA1504 


aroA 


1295 


0.98139351 


0.9768823 


1SA1505 


aroB 


1061 


0.9617768 


0.9543362 


ISA1506 


aroC . . 


1.163 


0.97207211. . 


.0.9660602 


ISA1507 




197 


0.4545246 


0.4362098 


1SA1508 




122 


0.312951; 


0.298755 


!SA1509 


. .... 


446 


0.7464464. 


0.7267625 


:SA1510 


. . _,. .. 


956 


0.947201 1 


0.9380237 


ISA1511 


— 


599 


0.8416434! 


0.8249163 


ISA1512 


, ..... 


569 


0.8263314- 


0.8089502 


{SA1513 


— . . 

;hUp 


269 


0.5629096' 


0.5427482 


iSA1515 


•b2511 


1307 


0.9820679; 


0.9776753 


;SA1516 


^rpsA 


1172 


0.9728348- 


0.9669372 


1SA1517 




113 


0.293661; 


0.2801533 


ISA1518 


•cmk 


548 


0.8147404- 


0.7969153 


:SA1521 




113 


0.293661 j 


0.2801533 


ISA1525 




245 


0.5294138! 


0.5096839 


ISA1627 




116 


0.3001504! 


0.2864081 


ISA1535 


JsrrA 


722 


0.89153561 


0.8775801 






1 .SANyMBER,i|v:GE^ | 
SAisSe'""" riuB 


I.-.- 


734 


P ROB AB 1 LlXy^ i 
o'y8954669 


LOWERk-'n 
0.8817798 


SA1537 







639 


0.8095389 


0.7915281 


•SA1541 


, -^v - 





' 44fi 


0 7464464 


0.7267625 


SA1542 


- .- 







0 8095389 


0J915281 


.SA1544 


. . „ . -.^ . 

- 









0 5294138 


0.5096839 


SA1547 


-- - 







0 3499651 

\J . w^ w w W w 1 


0 3345289 


SA1548 _ 








n Q4 04701 


0 Q305781 


.SA1'551 


^ . - 

malA 




10*10 




0 9916727 


SA1552 




■ 


1 u 10 


n Q^ifiioi 


0 94 794 97 


:SA1553 







ODO 


0 fi77fi774 


0 6571671 


SAl'556 







1 fO 


u.*i 1011 00 


0 400RQ4fi 


SAl'557 










n Q41RQfi'S 


•SA1558^ 


_ . , 







0 7*^^50104 


0 7170'S'iQ 


'SA1559 


_ . . 


. 




0 9'^*^440Q 


0 9414*^64 


•■SA1562 








yoy 




0 Q4'^RQRR 


•SA1564 J" 


recN 





1000 


n QQ1 lOP*^ 


0 QRR4QRQ 


SAi'565" 




- - 


*»*iy 


0 74fl77*iQ 


0 79Q1'^R7 


SA1566 


LSPA 





O/O 




0 Q999'^ft 


SA1567 


.... 


^Zi 


0^1t9R17R 




SA1570_ 







ooy 


n fiRRR977 


0 fi4R079fl 


ISA1571 * 




accC 






u.yo*tooD*i 


0 QA0414S 


•SA2572 


accB 





ACt 


n 7'>7RR1 Q 


n 7'^R49RQ 






. 




n Q*^R9'^'^R 

u.yoDzooo 


0 Q4R100Q 


SAi674 


« ^ - . 


,. 


000 


n fiAf^'k't 1 1 

1 1 1 


0 R9'^Q0^7 


!SA1575 








U.DOODOr 0 


n RIRRQRf^ 


:SA1576 








n o'^oRno'^' 
u.yoyouwoi 


0 Q'\1RR14 


1SA1577 








u.yyDODOu 


n QQ^'^IRR 


ISA1578 






1 000 


n QR2lCi9QQ5 

u.yoMO^yy 


0 QR0'=ift47 


!SA1579 






^4y^ 




n QQQ9ftQ'^ 


!SA1580 






OOU 




0 RfiflQ9R9 


•SA1581 










0 '^9fiCin4R 


•SA1582 










• n Q^'^i9iiQ' 










n orjir9'^r- 


, 0 0'=;77ft'5^ 


-rSA-1-583 






■ 1UuU 






1SA1584 






^yy 




n ^AOQROQ 


;SA1585 






01 f 


n R99Q1RSt' 


0 R09'^*^ft 


ISA1586 








n «^7fl7'^9R- 




ISA1587 


jefp ■ 






0 R1 fl19RQ' 


0 R0049Q9 


ISA1590 










0 4R4Q714 


ISA1591 


• 




ft97 


0 Q9147fi9i 


0 9098015 


ISA1596 


aroK 




OZ 1 


0 7QftRQ*^7- 


0 7R0321? 


!SA1597 










0 76P3713 

V/. f V^£.W f 1 w 


ISA1598 








0 7440953. 


0 7243675 


jSA1599 


• coidGC 






0 6123313- 

V#*w 1 Iw- 


0 5917894 


ISA1601 


:gspE 




Q71 


0 9495824 

W.w~ i^wU^ » 


0 9406699 


SA1602      -           620          0.8515512    0.8352918
SA1603      -           326          0.6332166    0.6126141
SA1604      glkA        983          0.9514098    0.9427053
SA1605      -           200          0.4595361    0.4411085
SA1608      rpmG        146          0.3618545    0.3460432
SA1610      sodA        596          0.840175     0.8233816
SA1611      -           407          0.7141223    0.6939366
SA1613      -           782          0.9098184    0.8971864



WO 2004/018624

Table I

PCT/US2003/025879

SA NUMBER   GENE NAME   SIZE         PROBABILITY  LOWER
SA1617      -           [illegible]  [illegible]  [illegible]
SA1618      rpoD        [illegible]  0.9664101    0.9595879
SA1619      dnaG        1796         0.9960167    [illegible]
SA1620      -           [illegible]  [illegible]  [illegible]
SA1622      -           [illegible]  [illegible]  [illegible]
SA1623      -           749          0.9001814    [illegible]
SA1625      cdd         401          0.7087961    [illegible]
SA1626      -           341          0.6497588    0.6291542
SA1632      rpsU        173          0.4127227    0.3954416
SA1634      -           749          0.9001814    [illegible]


SA1635      prmA        935          0.9436772    0.9341196
SA1637      dnaK        1829         0.9964013    0.99511
SA1639      hrcA        974          0.9500456    0.9411854
SA1643      -           971          0.9495824    0.9406699
SA1644      -           2144         0.9986346    0.9980441
SA1645      -           428          0.7320087    0.7120741
SA1646      comEA       683          0.8777079    0.8628729
SA1648      -           350          0.6593238    0.6387373
SA1649      -           581          0.8326263    0.8155043
SA1650      -           566          0.824721     0.8072756
SA1651      -           287          0.5864574    0.5660748
SA165?      -           1097         0.9657843    0.9588764
SA1654      -           524          0.8005432    0.78223
SA1655      ipis        683          0.8777079    0.8628729
SA1656      -           266          0.5588567    0.5387403
SA1657      -           701          0.8842963    0.8698684
SA1658      -           1217         0.976347     0.970994
SA1664      -           731          0.8944976    0.8807435
SA1668      -           920          0.941017     0.9311813
SA1669      -           635          0.8582463    0.8423243
SA1670      -           305          0.6087366    0.5882113
SA1671      -           425          0.7295237    0.7095503
SA1672      -           257          0.5464708    0.5265046
SA1673      alaS        2627         0.999691     0.9995201
SA1675      -           665          0.8707444    0.8555013
SA1676      trmU        1115         0.9676276    0.9609743
SA1677      -           1139         0.9699319    0.963606
SA1679      -           143          0.3559372    0.3403112
SA1680      -           179          0.4234643    0.405902
SA1681      -           419          0.7244844    0.7044363
SA1683      moeB        770          0.9064267    0.893534
SA1684      -           125          [illegible]  [illegible]
SA1685      aspS        1763         0.9955915    [illegible]
SA1686      hisS        1259         0.9792142    0.97433
SA1693      yajC        257          0.5464708    0.5265046
SA1694      tgt         1136         0.9696531    0.963287
SA1699      obg         1289         0.9810468    0.9764752
SA1700      rpmA        281          0.5787526    0.5584345
SA1701      -           317          0.6229185    0.602338
SA1702      rplU        305          0.6087366    0.5882113
SA1706      -           281          0.5787526    0.5584345
SA1707      radC        683          0.8777079    0.8628729
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SA1708 




704 


0.8853593 


0.8709991 


SA1709      -           1268         0.9797819    0.9749933
SA1710      valS        2627         0.999691     0.9995201
SA1713      -           98           0.2602999    0.2480474
SA1715      hemB        971          0.9495824    0.9406699
SA1716      -           665          0.8707444    0.8555013
SA1719      hemA        1343         0.983948     0.979895
SA1720      -           587          0.8356876    0.8186965
SA1724      -           [illegible]  0.8445398    [illegible]
SA1725      rplT        353          0.6624538    0.6418764
SA1726      rpmI        197          0.4545246    0.4362098
SA1727      infC        524          0.8005432    0.78223
SA1729      thrS        1934         0.9973947    0.9963971
SA1730      -           119          0.3065803    0.2926085
SA1731      dnaI        917          0.9404701    0.9305781
SA1732      -           1397         0.9864051    0.9828177
SA1734      gap         1022         0.9569039    0.9488503
SA1735      -           620          0.8515512    0.8352918
SA1736      [illegible] 869          0.9309965    0.9201752
SA1741      icd         1265         0.9795944    0.9747741
SA1747      accA        941          0.9447074    0.9352595
SA1749      -           1226         0.976993     0.9717436
SA1750      dnaE        3194         0.999946     0.9999078
SA1755      -           140          0.3499651    0.3345289
SA1757      -           95           0.2534409    0.2414564
SA1759      -           497          0.7832669    0.7644361
SA1761      -           944          0.9452154    0.9358221
SA1762      [illegible] 491          0.7792289    0.7602885
SA1765      -           1136         0.9696531    0.963287
SA1766      -           137          0.3439375    0.3286959
SA1768      -           461          0.7578819    0.7384289
SA1769      rpsD        599          0.8416434    0.8249163
SA1770      -           740          0.8973789    0.8838253
SA1776      -           614          0.8487854    0.8323918
SA1778      tyrS        1259         0.9792142    0.97433
SA1779      -           902          0.9376584    0.9274818
SA1780      -           89           0.2395316    0.2281006
SA1783      tecs        1703         0.9946972    0.9929451
SA1789      -           488          0.7771817    0.7581873
SA1790      murC        1310         0.9822327    0.9778693
SA1792      -           593          0.838693     0.8218335
SA1793      -           854          0.9277374    0.9166149
SA1794      -           308          0.6123313    0.5917894
SA1797      -           839          0.9243243    0.9128959
SA1798      -           641          0.8608391    0.8450525
SA1799      -           722          0.8915355    0.8775801
SA1802      -           419          0.7244844    0.7044363
SA1804      -           1658         0.9939097    0.9919584
SA1807      -           308          0.6123313    0.5917894
SA1808      leuS        2414         0.999405     0.9991083
SA1811      -           560          0.8214554    0.8038822
SA1812      rot         398          0.7060958    0.6858177
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|. SANUMBEB:%i|i,GEN 


SA NUMBER   GENE NAME   SIZE         PROBABILITY  LOWER
[illegible] -           824          0.9207501    0.9090109
SA1815      -           116          0.3001504    0.2864081
SA181?      ribH        461          0.7578819    0.7384289
SA182?      ribD        1001         [illegible]  [illegible]
SA1824      -           392          0.70062      0.6802859
SA1826      -           221          0.493351     0.4742287
SA1827      -           545          0.8130225    0.7951352
SA1828      -           440          0.7417224    0.7219516
SA1830      -           [illegible]  0.8569319    [illegible]
SA1831      -           710          0.8874561    [illegible]
SA1832      crcB        362          0.6716721    0.6511307
SA1834      -           1606         0.9928531    [illegible]
SA1836      -           908          0.9387987    [illegible]
SA1837      metK        1190         [illegible]  [illegible]
SA1840      -           767          0.9055591    0.8926008
SA1841      -           530          0.8041914    0.785998
SA1842      -           254          0.5422654    [illegible]
SA1843      menC        998          0.9536013    0.9451516
SA1844      menE        1475         0.9893066    [illegible]
SA1848      -           623          0.852915     [illegible]
SA1849      -           341          0.6497588    0.6291542
SA1852      -           182          0.4287612    0.4110641
SA1853      -           173          0.4127227    0.3954416
SA1856      -           98           0.2602999    0.2480474
SA1857      -           374          0.6835729    0.6630989
SA1858      -           557          0.8197998    [illegible]
SA1859      -           3047         0.9999151    [illegible]
SA1860      -           227          0.5026178    [illegible]
SA1861      hsdS        1196         0.9747684    [illegible]
SA1863      -           131          0.3317143    0.3168761
SA1865      -           713          0.8884901    [illegible]
SA1866      -           716          0.8895146    [illegible]
SA1869      -           713          0.8884901    [illegible]
SA1870      -           563          0.8230957    [illegible]
SA1871      -           695          0.8821406    [illegible]
SA1873      epiF        689          0.8799447    [illegible]
SA1876      epiC        1241         0.9780306    [illegible]
SA1879      -           1316         0.9825577    [illegible]
SA1884      -           113          0.293661     [illegible]
SA1885      -           551          0.8164424    [illegible]
SA1886      -           224          0.4980058    [illegible]
SA1887      -           1397         0.9864051    0.9828177
SA1889      hemE        1034         [illegible]  [illegible]
SA1890      -           107          0.280501     0.2674788
SA1894      -           419          0.7244844    0.7044363
SA1895      -           362          0.6716721    0.6511307
SA1896      -           554          0.8181289    0.8004292
SA1897      -           959          0.9476862    0.9385623
SA1898      cbf1        938          0.9441946    0.9346921
SA1901      -           95           0.2534409    0.2414564
SA1902      -           341          0.6497588    0.6291542
SA1904      -           461          0.7578819    0.7384289
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S!ZE^:V, >.feAPRgBABJD33^^i|^^^ 




SA1905      vraR        620          0.8515512    0.8352918
SA1907      -           818          0.9192736    0.9074089
SA1909      -           392          0.70062      0.6802859
SA1910      -           359          0.6686277    0.6480728
SA1911      -           149          0.3677175    0.3517255
SA1912      -           596          0.840175     0.8233816
SA1913      -           467          0.7623104    0.7429547
SA1915      -           725          0.892532     0.8786438
SA1918      -           1316         0.9825577    [illegible]
SA1919      -           443          0.7440953    0.7243675
SA1923      -           1091         0.9651468    [illegible]
SA1925      -           539          0.8095389    [illegible]
SA1928      -           833          0.9229144    0.9113622
SA1930      -           287          0.5864574    0.5660748
SA1934      -           158          0.384985     [illegible]
SA1938      -           203          0.4645016    [illegible]
SA1940      -           272          0.5669254    0.5467212
SA1945      -           383          0.6922144    [illegible]
SA1946      map         755          0.9020072    [illegible]
SA1950      -           728          0.8935193    [illegible]
SA1952      -           497          0.7832669    0.7644361
SA1953      -           89           0.2395316    0.2281006
SA1958      -           944          0.9452154    0.9358221
SA1959      -           143          0.3559372    0.3403112
SA1961      gatA        1454         0.9885919    0.985443
SA1962      gatC        299          0.6014469    [illegible]
SA1964      -           1196         0.9747684    [illegible]
SA1965      ligA        2000         0.9978735    [illegible]
SA1971      -           323          0.6298155    0.6092186
SA1972      -           170          0.4072771    [illegible]
SA1974      nadE        818          0.9192736    0.9074089
SA1975      -           1439         0.9880531    [illegible]
SA1982      ppaC        926          0.9420958    [illegible]
SA1983      -           170          0.4072771    [illegible]
SA1987      iccoS       170          0.4072771    [illegible]
SA1990      -           479          0.7709258    [illegible]
SA1992      -           560          0.8214554    0.8038822
SA1994      -           869          0.9309965    0.9201752
SA1998      -           173          0.4127227    0.3954416
SA1999      -           95           0.2534409    0.2414564
SA2005      -           140          0.3499651    0.3345289
SA2006      lukM        1052         0.9607036    [illegible]
SA2010      -           908          0.9387987    [illegible]
SA2011      -           1304         [illegible]  [illegible]
SA2012      -           440          0.7417224    0.7219516
SA2014      -           587          0.8356876    0.8186965
SA2016      groEL       1613         0.9930054    0.9908337
SA2017      groES       281          0.5787526    0.5584345
SA2018      -           740          0.8973789    0.8838253
SA2021      -           782          0.9098184    0.8971864
SA2022      hld         131          0.3317143    0.3168761
SA2024      agrD        137          0.3439375    0.3286959
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1 . SANU MBERv. | ♦ GENE NAMg^Jr.... 
SA NUMBER   GENE NAME   SIZE         PROBABILITY  LOWER
SA2030      scrR        947          0.9457187    0.9363797
SA2031      amt         [illegible]  0.9704825    [illegible]
SA2032      -           122          0.312951     0.298755
SA2033      -           221          0.493351     0.4742287
SA2034      -           983          0.9514098    0.9427053
SA2038      gcp         1022         0.9569039    0.9488503
SA2040      -           [illegible]  0.8683362    0.8529571
SA2041      -           431          0.7344708    0.7145759
SA2044      ilvN        251          0.538021     0.5181676
SA2047      -           1043         0.9596663    0.9518814
SA2054      sigB        767          0.9055591    0.8926008
SA2055      rsbW        506          0.7891858    0.7705234
SA2056      rsbV        323          0.6298155    0.6092186
SA2057      rsbU        998          0.9536013    0.9451516
SA2061      acpS        356          0.665555     0.6449881
SA2069      -           107          0.280501     0.2674788
SA2073      [illegible] 1355         0.9845299    0.9805847
SA2074      -           1067         0.9624759    [illegible]
SA2076      -           134          0.3378541    0.3228118
SA2077      -           206          0.4694214    0.4507788
SA2079      cls         1481         0.9895013    0.9865426
SA2080      -           644          0.8621176    0.8463989
SA2081      -           89           0.2395316    0.2281006
SA2082      -           869          0.9309965    0.9201752
SA2083      thiE        638          0.8595487    0.8436944
SA2087      -           128          0.3255175    0.3108884
SA2089      -           392          0.70062      0.6802859
SA2090      ywpF        437          0.7393275    0.7195144
SA2091      fabZ        437          0.7393275    0.7195144
SA2093      -           230          0.5071875    0.4878153
SA2098      atpH        536          0.8077728    0.7897008
SA2100      atpE        209          0.4742961    0.455551
SA2101      atpB        725          0.892532     0.8786438
SA2102      -           449          0.7487759    0.7291367
SA2104      upp         626          0.8542664    0.8381417
SA2108      -           1094         0.965467     0.9585159
SA2109      -           833          0.9229144    0.9113622
SA2110      prfA        1073         0.9631622    0.9559027
SA2112      rpmE        251          0.538021     0.5181676
SA2115      -           332          0.6399252    0.6193169
SA2117      fba         857          0.9284013    0.9173395
SA2121      -           857          0.9284013    0.9173395
SA2122      -           [illegible]  [illegible]  [illegible]
SA2131      -           440          0.7417224    0.7219516
SA2132      -           410          0.7167488    0.696596
SA2134      -           236          0.5162013    0.4966774
SA2135      manA        935          0.9436772    0.9341196
SA2137      czrA        317          0.6229185    0.602338
SA2139      -           104          0.2738294    0.2610581
SA2143      -           824          0.9207501    0.9090109
SA2145      glmS        1802         0.9960895    0.9947104
SA2152      -           929          0.9426278    0.9329597
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SANUMBER 

SA2161
SA2166
SA2167
SA2168
SA2173
SA2175
SA2182
SA2183
SA2184
SA2185
SA2187
SA2190
SA2191
SA2193
SA2195
SA2200
SA2201
SA2203
SA2207
SA2209

GENE NAME
arg
lacD
lacC
lacB

806
1199
1184
980
1067
506
533
977
929
512
113
122
413

0.9162375
0.9041196
0.9382312
0.9281119
0.9750002
0.9738194
0.9694347
0.9680715

rplM

851
92
596
1133
434
803
857

0.9576922
0.9509593
0.9624759
0.7891858
0.8059904
0.6123313
0.9505045
0.9426278
0.7930417
0.293661
0.8445398
0.312951
0.7193512
0.9270673
0.2465183
0.840175
0.9693717
0.7369104
0.9154668
0.9284013

0.9497353
0.9422031
0.9551263
0.7705234
0.7878575
0.5917894
0.9416965
0.9329597
0.7744939
0.2801533
0.8279457
0.298755
0.6992323
0.915884
0.2348076
0.8233816
0.9629652
0.7170559
0.9032792
0.9173395



SA2212      rplQ        365          0.6746886    0.6541621
SA2213      rpoA        941          0.9447074    0.9352595
SA2214      rpsK        386          0.6950422    0.6746566
SA2215      rpsM        362          0.6716721    0.6511307
SA2216      rpmJ        110          0.2871114    0.2738437
SA2217      infA        215          0.4839115    0.4649714
SA2218      adk         644          0.8621176    0.8463989
SA2219      secY        1289         0.9810468    0.9764752
SA2220      rplO        437          0.7393275    0.7195144
SA2221      rpmD        176          0.4181183    0.4006946
SA2222      rpsE        497          0.7832669    0.7644361
SA2223      rplR        356          0.665555     0.6449881
SA2225      rpsH        395          0.7033706    0.6830639
SA2226      rpsN        182          0.4287612    0.4110641
SA2227      rplE        488          0.7771817    0.7581873
SA2228      rplX        314          0.619422     0.5988525
SA2229      rplN        365          0.6746886    0.6541621
SA2230      rpsQ        260          0.5506376    0.5306188
SA2231      rpmC        206          0.4694214    0.4507788
SA2232      rplP        431          0.7344708    0.7145759
SA2233      rpsC        650          0.8646395    0.8490565
SA2234      rplV        350          0.6593238    0.6387373
SA2235      rpsS        275          0.5709042    0.5506598
SA2236      rplB        830          0.9221996    0.9105853
SA2237      rplW        272          0.5669254    0.5467212
SA2238      rplD        620          0.8515512    0.8352918
SA2239      rplC        626          0.8542664    0.8381417
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